5-1 Assignment: Data Timeline

  • Late 2017: Data Migration to New Location

    An SQL database was approved for data storage, management, and analysis during the expansion. The IT system administrator physically transferred the data and the codebook to the new location using a flash drive and then promptly imported it into the new system.
    Data quality may have been compromised and records lost during the transfer and import, which would bias any further analysis.
  • 2017: Initial Data Analysis

    The original analyst examined the 2017 data to determine whether a lack of medical care was an issue in Alabama and the reasons behind it.
    Results are potentially biased due to the data subset and the inherent response bias of the survey.
  • 2017: BRFSS Data Collection & Initial Dataset

    BRFSS (https://www.cdc.gov/brfss/index.html) collected nationwide health data via phone surveys. An Alabama-specific subset of heart health risks was then extracted and saved as a CSV file. Note: January 2018 data is an outlier/follow-up.
    Sampling and response biases are inherent in the BRFSS methodology and may affect reliability.
    Data Types: Numerical, Categorical, and Character
    Initial Data Quality: High
  • June 2018: Data Loss & Discrepancy Discovered

    A new data analyst was tasked with reviewing the master data set to identify the top three health concerns that the new facility should prioritize for heart health. The review determined that the CSV file contained unreadable characters that the SQL database could not process during the data migration. As a result, some rows of data failed to import correctly, leading to data loss.
    Severe data loss reduces the sample size and statistical power and makes it difficult to reliably generalize findings.
  • Today: Help Wanted

    The data analyst found discrepancies in the number of imported rows and sought our assistance in identifying the missing information, analyzing the data's journey, and assessing the impact of the data loss on quality. Ethical considerations, such as HIPAA compliance, may apply.
    Note: Timeline dates are approximate due to scenario/data inconsistencies.
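
One way to start tracing the missing rows is to scan the original CSV for lines containing bytes the target encoding cannot decode, then compare the number of clean rows with the row count in the SQL database. The sketch below is illustrative only: the scenario does not state the file's encoding or column names, so UTF-8, the sample columns, and the function name find_unreadable_rows are all assumptions. It also assumes one record per line (no quoted fields with embedded line breaks).

```python
def find_unreadable_rows(raw: bytes, encoding: str = "utf-8"):
    """Return (total_data_rows, bad_rows).

    bad_rows holds the 1-based data-row numbers (header excluded) whose
    bytes cannot be decoded with `encoding`. UTF-8 is an assumption; the
    scenario does not say which encoding the SQL import expected.
    """
    lines = raw.splitlines()
    data_lines = lines[1:]  # skip the header row
    bad_rows = []
    for row_number, line in enumerate(data_lines, start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError:
            bad_rows.append(row_number)
    return len(data_lines), bad_rows


# Hypothetical sample data, not taken from the BRFSS extract.
SAMPLE = (
    b"id,state,condition\n"
    b"1,AL,hypertension\n"
    b"2,AL,high chol\xe9sterol\n"  # a lone 0xE9 byte is not valid UTF-8
    b"3,AL,diabetes\n"
)

total, bad = find_unreadable_rows(SAMPLE)
expected_imported = total - len(bad)  # rows a strict UTF-8 import could read
print(f"data rows: {total}, unreadable rows: {bad}, "
      f"expected imported: {expected_imported}")
```

If the row count reported by the database matches the expected imported count, the unreadable rows likely account for the whole discrepancy; if it is lower, other causes of loss should be investigated as well.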