CDC BRFSS Data Analysis Timeline

  • Period: to

    Phase One

    Context and Scope
    -Data familiarization
    -Define objectives
    -Data validation
  • Data Familiarization

    Data Familiarization
    Load the dataset into a data analysis framework. Review the variable metadata to understand what each field records, then construct a variable dependency graph.
  • Define objectives

    Define objectives
    Establish measurement criteria for progress and success.
  • Data validation

    Data validation
    Implement error handling for missing or indeterminate values, then validate and correct data anomalies.
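
    The handling of missing or indeterminate values could be sketched as below. The sentinel codes in MISSING_CODES (7 and 9 for "don't know" and "refused", 7777 and 9999 for the measurement fields) are assumptions for illustration and should be confirmed against the codebook for the survey year in use. The helper names clean_value and clean_record are hypothetical.

```python
from typing import Any, Dict, Optional

# ASSUMPTION: illustrative sentinel codes, not taken from the timeline itself.
# Confirm each entry against the codebook for your survey year.
MISSING_CODES = {
    "SMOKE100": {7, 9},
    "HLTHPLN1": {7, 9},
    "WEIGHT2": {7777, 9999},
    "HEIGHT3": {7777, 9999},
}


def clean_value(variable: str, raw: Any) -> Optional[int]:
    """Return the value as an int, or None if it is blank, invalid, or a sentinel."""
    if raw is None:
        return None
    try:
        value = int(raw)  # raises ValueError for NaN and non-numeric strings
    except (TypeError, ValueError):
        return None
    if value in MISSING_CODES.get(variable, set()):
        return None
    return value


def clean_record(record: Dict[str, Any]) -> Dict[str, Optional[int]]:
    """Apply clean_value to every field of one survey record."""
    return {name: clean_value(name, raw) for name, raw in record.items()}
```

    Mapping every indeterminate answer to None keeps later summaries from treating a code such as 9999 as a real measurement.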
  • Period: to

    Phase Two

    Data cleaning and validation
    -Conduct data normalization or standardization
    -Construct Computed Variables
  • Conduct data normalization or standardization

    Conduct data normalization or standardization
    Use Python to combine the individual date fields (day, month, year) into a single standardized datetime object.
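
    A minimal sketch of that step, assuming the day, month, and year arrive as separate strings or numbers. The function name build_interview_date is hypothetical, and impossible dates are returned as None rather than raising.

```python
from datetime import datetime
from typing import Any, Optional


def build_interview_date(day: Any, month: Any, year: Any) -> Optional[datetime]:
    """Combine separate day, month, and year fields into one datetime.

    Returns None when a field is missing or the combination is not a
    real calendar date (for example 31 February).
    """
    try:
        return datetime(int(year), int(month), int(day))
    except (TypeError, ValueError):
        return None
```

    For instance, build_interview_date("05", "02", "2015") gives a datetime for 5 February 2015, while a record with day 31 and month 02 comes back as None and can be routed to the data validation step.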
  • Construct Computed Variables

    Construct Computed Variables
    Derive BMI from the provided WEIGHT2 and HEIGHT3 measurements. Determine the time interval between the current date and the last checkup (CHECKUP1).
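
    The BMI derivation might look like the sketch below. It assumes WEIGHT2 and HEIGHT3 use a mixed encoding: plain values for pounds and feet/inches (504 meaning 5 ft 4 in), and a leading 9 for metric values (9070 meaning 70 kg, 9165 meaning 1 m 65 cm). The numeric ranges are assumptions and should be checked against the codebook for the survey year. CHECKUP1 is left out here; if it is stored as a coded category rather than a date, the interval would come from a lookup on those codes.

```python
from typing import Optional

LB_TO_KG = 0.45359237  # exact definition of the pound in kilograms
IN_TO_M = 0.0254       # exact definition of the inch in metres


def weight_kg(weight2: Optional[int]) -> Optional[float]:
    """Decode WEIGHT2 into kilograms (ASSUMED ranges, verify in the codebook)."""
    if weight2 is None:
        return None
    if 50 <= weight2 <= 776:          # assumed: weight in pounds
        return weight2 * LB_TO_KG
    if 9023 <= weight2 <= 9352:       # assumed: 9 followed by kilograms
        return float(weight2 - 9000)
    return None                       # sentinel or out-of-range code


def height_m(height3: Optional[int]) -> Optional[float]:
    """Decode HEIGHT3 into metres (ASSUMED ranges, verify in the codebook)."""
    if height3 is None:
        return None
    if 200 <= height3 <= 711:         # assumed: feet and inches, e.g. 504
        feet, inches = divmod(height3, 100)
        if inches > 11:
            return None
        return (feet * 12 + inches) * IN_TO_M
    if 9061 <= height3 <= 9998:       # assumed: 9 followed by centimetres
        return (height3 - 9000) / 100.0
    return None


def bmi(weight2: Optional[int], height3: Optional[int]) -> Optional[float]:
    """Body mass index in kg/m^2, or None if either input cannot be decoded."""
    kg, m = weight_kg(weight2), height_m(height3)
    if kg is None or m is None:
        return None
    return kg / (m * m)
```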
  • Period: to

    Phase Three

    Preliminary Data Analysis
    -Statistical Summaries
    -Visual Data Analysis
    -Dependency Analysis
  • Statistical Summaries

    Statistical Summaries
    Use Python to calculate descriptive statistics for the numeric variables NUMADULT, WEIGHT2, and HEIGHT3. Compute frequency distributions for the categorical variables SMOKE100 and HLTHPLN1.
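
    Both kinds of summary can be sketched with the standard library alone. The function names describe and frequencies are hypothetical, and None is assumed to mark a missing value. In pandas, DataFrame.describe and Series.value_counts cover similar ground.

```python
import statistics
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def describe(values: Iterable[Optional[float]]) -> Dict[str, float]:
    """Descriptive statistics for a numeric variable, ignoring None."""
    data = [v for v in values if v is not None]
    if len(data) < 2:
        raise ValueError("need at least two non-missing values")
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "std": statistics.stdev(data),   # sample standard deviation
        "min": min(data),
        "median": statistics.median(data),
        "max": max(data),
    }


def frequencies(values: Iterable[Optional[int]]) -> Dict[int, Tuple[int, float]]:
    """Map each category code to (count, proportion), ignoring None."""
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    return {code: (n, n / total) for code, n in counts.items()}
```

    describe would be applied to columns such as NUMADULT, WEIGHT2, or HEIGHT3 after decoding, and frequencies to SMOKE100 or HLTHPLN1.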
  • Visual data analysis

    Visual data analysis
    Categorical Data Visualization: Use bar charts to show the frequency or proportion of each category within a categorical variable, such as BPHIGH4 status, across gender groups. Continuous Data Visualization: Use boxplots or histograms to assess the central tendency, variability, and shape of the distribution of continuous variables, such as WEIGHT2.
  • Dependency analysis

    Dependency analysis
    Assess the statistical significance of the relationship between EXEROFT1 and BPHIGH4.
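
    One common way to assess such a relationship is a chi-square test of independence on a contingency table. The sketch below assumes both variables have first been recoded into categories (for example, EXEROFT1 binned into "exercises" and "does not exercise"). That recoding is an assumption, not something the timeline specifies. The p-value is computed only for a 2x2 table (one degree of freedom); larger tables would need a chi-square distribution function, such as the one in SciPy.

```python
import math
from typing import List, Optional, Tuple


def chi_square_independence(
    table: List[List[float]],
) -> Tuple[float, int, Optional[float]]:
    """Pearson chi-square test of independence on a table of counts.

    Returns (statistic, degrees_of_freedom, p_value). The p-value is only
    computed when there is one degree of freedom, otherwise it is None.
    Every row and column total must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    # With 1 degree of freedom, P(X > x) equals P(|Z| > sqrt(x)) for a
    # standard normal Z, which is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2)) if dof == 1 else None
    return chi2, dof, p_value
```

    Rows would hold the exercise groups and columns the BPHIGH4 groups. A small p-value suggests the two variables are not independent in the sample, although this unweighted version ignores the survey's sampling weights.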
  • Period: to

    Phase Four

    Prescriptive Analytics
    -Pattern Analysis
    -Stratification
    -Statistical Hypothesis Testing
  • Pattern analysis

    Pattern analysis
    Assess temporal health trends based on interview dates. Conduct demographic comparisons to identify health variations across different population subgroups.
  • Stratification

    Stratification
    Partition the dataset by categorical variables like gender, smoking status, and health plan coverage.
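
    Partitioning could be sketched as grouping records by the tuple of their category values. The stratify helper is hypothetical, and the column name SEX is an assumption for the gender variable. SMOKE100 and HLTHPLN1 are the variables named earlier in this timeline.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple

Record = Dict[str, Any]


def stratify(records: List[Record], keys: Sequence[str]) -> Dict[Tuple, List[Record]]:
    """Group records into strata keyed by their values for the given columns."""
    strata: Dict[Tuple, List[Record]] = defaultdict(list)
    for record in records:
        # A missing column yields None, so incomplete records form their own stratum.
        strata[tuple(record.get(key) for key in keys)].append(record)
    return dict(strata)
```

    Calling stratify(records, ["SEX", "SMOKE100", "HLTHPLN1"]) returns one list of records per observed combination, which can then be summarized separately.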
  • Statistical hypothesis testing

    Statistical hypothesis testing
    Use statistical inference to draw conclusions about the relationship between smoking habits and general health based on sample evidence.
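
    As one possible form of this inference, the sketch below runs a two-proportion z-test comparing the share of respondents reporting poor general health among smokers and non-smokers. The choice of test, the way "poor health" is defined, and the function name are all assumptions. The timeline does not say which test is intended, and this unweighted version ignores survey weights.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Two-sided z-test for the difference between two proportions.

    x1 of n1 respondents in group 1 and x2 of n2 in group 2 have the
    outcome. Returns (z, p_value) using the pooled standard error.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("outcome is constant across both groups")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

    The null hypothesis is that both groups have the same proportion. It would be rejected at the 5% level when the returned p-value is below 0.05.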
  • Period: to

    Phase Five

    Research Synthesis and Visualization
    -Collation of Insights
    -Devise Actionable Strategies
    -Present findings
  • Collation of insights

    Collation of insights
    Visualize key findings.
  • Devise actionable strategies

    Devise actionable strategies
    Implement strategies derived from analysis.
  • Present findings

    Present findings
    Utilize Tableau to develop a clear and concise report to visualize the analytical findings.