DAT 223 Module Five Assignment One

By huafist
  • Period: to

    Data Collected by BRFSS

    BRFSS conducted telephone surveys across the U.S. to collect data on health risk behaviors, chronic conditions, and use of preventative services.
  • Data Retrieved from BRFSS

    A subset of the data for Alabama containing both categorical and numerical data was retrieved from the BRFSS database.
  • Initial Analysis Conducted

    The client's internal team performs an initial analysis to determine whether there is a lack of medical care in Alabama and, if so, why.
  • Data Stored in Locked Room

    The data set was stored in CSV format on a secure hard drive, along with the codebook, and placed in a securely locked room. Although the data is stored securely, ethical use still requires careful management to avoid bias or misclassification in population-level analysis.
  • Period: to

    Data Transferred

    The data set and codebook are copied to a flash drive, delivered to the new location, and immediately imported into the new SQL database.
  • Analyst Discovers Data Loss

    The analyst finds that the SQL database has fewer records than the original CSV file. The missing records might compromise the validity of the analysis. For instance, missing values for heart attack (CVDINFR4) or coronary heart disease (CVDCRHD4) could skew conclusions about heart disease.
  • New Analyst Hired

    A new analyst hired in June 2018 is tasked with re-reviewing the master dataset to determine key health priorities for the new facility.
  • Migration Issues Occur During SQL Import

    Characters in the original CSV data set that the SQL database cannot read cause some rows to fail during import, resulting in partial data loss. No mention is made of any data validation process or error logs.
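
The timeline does not describe any validation step or error log for the import, so the following is only a minimal sketch of what such a check might look like, not the process that was actually used. It assumes an in-memory SQLite database as a stand-in for the unspecified SQL database, a hypothetical table name (`brfss_al`), a hypothetical `RECORD_ID` column, made-up sample values, UTF-8 as the expected encoding, and one record per physical line. Rows that cannot be decoded or that have the wrong number of fields are written to a reject log instead of being dropped silently, and the row counts are reconciled afterward.

```python
import csv
import sqlite3


def import_with_validation(raw: bytes, conn: sqlite3.Connection, table: str = "brfss_al"):
    """Load CSV bytes into `table`, logging rows that cannot be loaded.

    Returns (source_rows, loaded_rows, rejects), where rejects is a list of
    (line_number, reason) tuples. Assumes one record per physical line.
    """
    lines = raw.splitlines()
    header = next(csv.reader([lines[0].decode("utf-8")]))
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')

    loaded, rejects = 0, []
    for line_number, line in enumerate(lines[1:], start=2):
        try:
            text = line.decode("utf-8")  # strict decoding raises on unreadable bytes
        except UnicodeDecodeError as exc:
            rejects.append((line_number, f"undecodable byte at offset {exc.start}"))
            continue
        row = next(csv.reader([text]))
        if len(row) != len(header):
            rejects.append((line_number, f"expected {len(header)} fields, got {len(row)}"))
            continue
        conn.execute(f'INSERT INTO "{table}" VALUES ({placeholders})', row)
        loaded += 1
    conn.commit()
    return len(lines) - 1, loaded, rejects


# Made-up sample: the second record contains a byte that is not valid UTF-8.
sample = (
    b"RECORD_ID,CVDINFR4\n"
    b"1,2\n"
    b"2,\xff\n"
    b"3,1\n"
)

conn = sqlite3.connect(":memory:")
source_rows, loaded_rows, rejects = import_with_validation(sample, conn)
db_rows = conn.execute('SELECT COUNT(*) FROM "brfss_al"').fetchone()[0]

# Reconciliation: every source row is either in the database or in the reject log.
print(f"source={source_rows} loaded={loaded_rows} in_db={db_rows}")
for line_number, reason in rejects:
    print(f"rejected line {line_number}: {reason}")
```

With a check like this, a mismatch between the CSV and the database would surface at import time, together with the line numbers of the affected records, rather than being discovered later by an analyst.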