Open In Colab

Sugar

Sugar and Education

Data Cleaning

CDC 2013 Source

TABLE 3. Crude prevalence* of sugar-sweetened beverage† consumption ≥1 time/day among adults, by employment status, education, and state — Behavioral Risk Factor Surveillance System, 23 states and District of Columbia, 2013

https://www.cdc.gov/mmwr/volumes/65/wr/mm6507a1.htm

import pandas as pd

cdc_2013 = pd.read_csv("https://raw.githubusercontent.com/noahgift/sugar/master/data/education_sugar_cdc_2003.csv")
cdc_2013.set_index("State", inplace=True)
cdc_2013.head()
Employed Not employed Retired <High school High school Some college College graduate
State
Alaska 26.2 (23.6–28.9) 32.1 (27.8–36.8) 16.0 (12.6–20.2) 47.1 (37.8–56.5) 34.9 (31.1–38.9) 24.2 (21.0–27.8) 12.9 (10.5–15.7)
Arizona 33.0 (28.5–37.8) 28.7 (23.5–34.5) 13.8 (10.8–17.5) 40.4 (30.9–50.7) 36.5 (30.7–42.7) 24.4 (19.9–29.4) 14.6 (11.6–18.3)
California 22.9 (20.9–25.1) 30.2 (27.1–33.4) 15.0 (12.2–18.2) 38.5 (34.2–43.0) 29.9 (26.5–33.7) 21.4 (18.8–24.2) 11.5 (9.8–13.5)
Connecticut 18.9 (17.1–20.9) 24.3 (20.8–28.2) 15.0 (12.7–17.7) 27.8 (22.4–33.9) 26.9 (23.7–30.3) 19.9 (17.2–23.0) 10.2 (8.7–12.0)
District of Columbia 18.5 (15.7–21.7) 34.6 (29.5–40.1) 18.5 (15.3–22.1) 45.6 (36.4–55.2) 39.0 (33.1–45.2) 28.9 (23.4–35.0) 8.4 (7.0–10.1)
for column in cdc_2013.columns:
  cdc_2013[column]=cdc_2013[column].str.replace(r"\(.*\)","")
  cdc_2013[column]=pd.to_numeric(cdc_2013[column])
  
cdc_2013.reset_index(inplace=True) 
cdc_2013.head()
  
State Employed Not employed Retired <High school High school Some college College graduate
0 Alaska 26.2 32.1 16.0 47.1 34.9 24.2 12.9
1 Arizona 33.0 28.7 13.8 40.4 36.5 24.4 14.6
2 California 22.9 30.2 15.0 38.5 29.9 21.4 11.5
3 Connecticut 18.9 24.3 15.0 27.8 26.9 19.9 10.2
4 District of Columbia 18.5 34.6 18.5 45.6 39.0 28.9 8.4
cdc_2013.describe()
Employed Not employed Retired <High school High school Some college College graduate
count 24.000000 24.000000 24.000000 24.000000 24.000000 24.000000 24.000000
mean 32.325000 35.408333 18.533333 44.662500 37.416667 30.262500 17.358333
std 9.917803 9.056485 5.975142 8.588658 8.243399 8.490138 6.730264
min 16.700000 21.500000 8.900000 27.800000 21.500000 16.900000 7.800000
25% 23.400000 29.750000 14.625000 39.625000 31.925000 24.200000 12.850000
50% 31.550000 32.600000 16.750000 46.350000 36.750000 28.200000 15.300000
75% 42.025000 46.025000 22.550000 51.200000 46.525000 39.250000 23.500000
max 49.700000 49.500000 29.700000 60.000000 50.800000 47.200000 34.900000

Education and Sugar

!wget https://raw.githubusercontent.com/python-visualization/folium/master/examples/data/us-states.json
!ls -l
--2019-03-20 23:57:49--  https://raw.githubusercontent.com/python-visualization/folium/master/examples/data/us-states.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 87688 (86K) [text/plain]
Saving to: ‘us-states.json.2’

us-states.json.2    100%[===================>]  85.63K  --.-KB/s    in 0.02s   

2019-03-20 23:57:50 (3.39 MB/s) - ‘us-states.json.2’ saved [87688/87688]

total 280
drwxr-xr-x 1 root root  4096 Mar  8 17:26 sample_data
-rw-r--r-- 1 root root 87688 Mar 20 22:43 us-states.json
-rw-r--r-- 1 root root 87688 Mar 20 22:43 us-states.json.1
-rw-r--r-- 1 root root 87688 Mar 20 23:57 us-states.json.2

Low Education == High Sugar

import folium
m = folium.Map(location=[36, -102], zoom_start=3)

folium.Choropleth(
    geo_data="us-states.json",
    name='choropleth',
    data=cdc_2013,
    columns=['State', '<High school'],
    key_on='feature.properties.name',
    fill_color='OrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='<High school Education and Grams Sugar Intake Daily'
).add_to(m)

folium.LayerControl().add_to(m)

m

College Education Major Decrease in Sugar Intake

import folium
m = folium.Map(location=[36, -102], zoom_start=3)

folium.Choropleth(
    geo_data="us-states.json",
    name='choropleth',
    data=cdc_2003,
    columns=['State', 'College graduate'],
    key_on='feature.properties.name',
    fill_color='OrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='College graduate and Grams Sugar Intake Daily'
).add_to(m)

folium.LayerControl().add_to(m)

m

Median Daily Sugar Intake by Category

cdc_2003.columns
Index(['State', 'Employed', 'Not employed', 'Retired', '<High school',
       'High school', 'Some college', 'College graduate'],
      dtype='object')

College Graduate

cdc_2003["College graduate"].median()
15.3
cdc_2003["<High school"].median()
46.35
cdc_2003[["State","College graduate", "<High school"]].plot.barh(
    stacked=True, figsize=(12, 12)).set_title("CDC 2013: Three Times Higher Sugar Intake College vs High School Grads")
Text(0.5, 1.0, 'CDC 2013: Three Times Higher Sugar Intake College vs High School Grads')

png