AI Revolution Atlas

Better evidence

Read data before you trust AI or dashboards

AI, spreadsheets, and charts depend on data. Learn how to ask basic questions before believing a number or visual.

13 minute read · Last reviewed 2026-06-20

Plain-language summary

What this guide covers

Data literacy means understanding where data comes from, what it measures, what is missing, how it may be biased, and how numbers are summarized. It helps you judge spreadsheets, dashboards, surveys, charts, and AI outputs that depend on data.

Why it matters

Bad data can make a dashboard look precise, a chart look persuasive, or an AI output sound evidence-based. Data literacy gives beginners practical questions that reduce overconfidence.

What you will learn

  • Identify source, definition, and purpose questions for any dataset.
  • Check data quality using completeness, accuracy, consistency, timeliness, validity, and uniqueness.
  • Distinguish counts, averages, percentages, rates, and sample sizes.
  • Recognize missing data, bias, uncertainty, and misleading charts.
  • Use a claim-checking exercise before repeating a data claim.

Guide section

Start with the source

Before analyzing data, ask where it came from and what it was collected to do.

A dataset is not just numbers in rows and columns. It has a source, collection method, purpose, definitions, time period, and limits. A customer spreadsheet, school dashboard, public survey, and AI training dataset each answer different questions. If you do not know what a field means or how the data was collected, you should be cautious about conclusions.

Source questions

  • Who collected the data?
  • Why was it collected?
  • What time period does it cover?
  • What does each field mean?
  • Who or what is included?
  • Who or what might be missing?
  • What decision will this data support?
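
The source questions above can be kept as a short written record next to the dataset. What follows is a minimal sketch, not a standard format: the `SourceNotes` class, its field names, and the sample answers are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class SourceNotes:
    """Answers to the seven source questions (hypothetical structure)."""
    collected_by: str = ""
    purpose: str = ""
    time_period: str = ""
    field_meanings: str = ""
    included: str = ""
    possibly_missing: str = ""
    decision_supported: str = ""


def unanswered(notes: SourceNotes) -> list[str]:
    """Return the names of questions that still have no written answer."""
    return [f.name for f in fields(notes) if not getattr(notes, f.name).strip()]


# Illustrative answers for a made-up service-call export.
notes = SourceNotes(
    collected_by="Front-desk staff",
    purpose="Scheduling service calls",
    time_period="January to March",
)

# Lists the questions that still need an answer before drawing conclusions.
print(unanswered(notes))
```

An unanswered question does not prove the data is wrong. It marks a place where caution is still needed.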

Guide section

Six practical quality checks

Data quality is not one thing. A dataset can be complete but still inaccurate, or current but inconsistent.

Quality check | Question to ask | Simple example
Completeness | Are needed records and fields present? | A customer list has phone numbers for 94 of 100 customers.
Accuracy | Are values correct? | A complete address field still contains wrong street numbers.
Consistency | Do values follow the same definitions and formats? | One file uses NY and another uses New York.
Timeliness | Is the data current enough for the decision? | A price list from last year may not support today’s quote.
Validity | Do values fit the expected format or range? | A month field should contain values from 1 to 12.
Uniqueness | Are duplicate records controlled? | The same customer appears twice under slightly different names.
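
Several of these checks can be run on a small dataset with a few lines of code. The sketch below uses invented customer records, and the field names, the 365-day staleness threshold, and the duplicate rule (same name ignoring case, same phone) are assumptions chosen for illustration. Accuracy is left as a comment because it usually cannot be verified from the file alone.

```python
from datetime import date

# Hypothetical customer records, invented for illustration.
records = [
    {"id": 1, "name": "Ana Ruiz", "state": "NY", "phone": "555-0101",
     "signup_month": 3, "updated": date(2026, 5, 1)},
    {"id": 2, "name": "Ben Cole", "state": "New York", "phone": "",
     "signup_month": 13, "updated": date(2024, 1, 15)},
    {"id": 3, "name": "ana ruiz", "state": "NY", "phone": "555-0101",
     "signup_month": 3, "updated": date(2026, 5, 1)},
]
as_of = date(2026, 6, 20)  # assumed "today" for the timeliness check

# Completeness: share of records that have a phone number.
completeness = sum(1 for r in records if r["phone"]) / len(records)

# Accuracy: needs a trusted reference (for example, calling the customer);
# it cannot be confirmed from this file alone.

# Consistency: more than one spelling here suggests mixed formats.
state_spellings = {r["state"] for r in records}

# Timeliness: records not updated within the assumed 365-day window.
stale_ids = [r["id"] for r in records if (as_of - r["updated"]).days > 365]

# Validity: months must fall between 1 and 12.
invalid_month_ids = [r["id"] for r in records if not 1 <= r["signup_month"] <= 12]

# Uniqueness: flag later records that repeat an earlier name and phone.
seen = set()
duplicate_ids = []
for r in records:
    key = (r["name"].lower(), r["phone"])
    if key in seen:
        duplicate_ids.append(r["id"])
    seen.add(key)

print(completeness, state_spellings, stale_ids, invalid_month_ids, duplicate_ids)
```

Each check can pass or fail independently, which is why the checks are reported separately rather than as a single quality score.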

Example

Spreadsheet scenario

A small business exports a list of service calls. Before asking AI to summarize trends, the owner checks that every call has a date, a category, and complete notes, that no rows are duplicated, and that labels are consistent. The owner finds that emergency calls are sometimes labeled urgent, ASAP, or high. The trend cannot be trusted until labels are cleaned or grouped with care.
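
The label problem in this scenario can be sketched in code. The labels and the mapping below are invented, and treating urgent, ASAP, and high as the same category is an assumption a person would need to confirm with whoever entered the data.

```python
from collections import Counter

# Hypothetical labels as they might appear in the export.
raw_labels = ["urgent", "ASAP", "routine", "high", "Urgent ", "routine", "follow-up"]

# Assumed synonyms for an emergency call; confirm before relying on this.
EMERGENCY_SYNONYMS = {"urgent", "asap", "high"}


def group_label(label: str) -> str:
    """Trim spaces, ignore case, and fold assumed synonyms into one label."""
    cleaned = label.strip().lower()
    return "emergency" if cleaned in EMERGENCY_SYNONYMS else cleaned


before = Counter(raw_labels)
after = Counter(group_label(label) for label in raw_labels)

print(before.most_common(1))  # most frequent label before grouping
print(after.most_common(1))   # most frequent label after grouping
```

Before grouping, routine calls look like the largest category. After grouping, emergency calls do, so the cleaning decision changes the trend a summary would report.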

Guide section

Metrics, averages, rates, and uncertainty

A number is easier to trust when you know what it counts and what it leaves out.

Term | Plain meaning | Question to ask
Count | How many items or people are included. | Is the count complete, duplicated, or filtered?
Average | A single value used to summarize a group. | Are there outliers that make the average misleading?
Percentage | A part out of 100. | What is the denominator?
Rate | A measure adjusted for exposure, population, or time. | Rate per what: person, customer, hour, visit, or dollar?
Sample size | How many observations were used. | Is the sample large and relevant enough for the claim?
Uncertainty | The range of possible error or doubt around a measurement. | Does the source describe reliability, margin, or limitations?
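
A short worked example shows why these terms answer different questions. All numbers below are invented, and the helper names `percentage` and `rate_per_1000` are illustrative only.

```python
from statistics import mean, median

# Average: one unusually large invoice pulls the mean far above a typical value.
invoice_totals = [120, 135, 110, 140, 125, 2400]
print(mean(invoice_totals), median(invoice_totals))


def percentage(part, whole):
    """Share of the whole, expressed out of 100."""
    return 100 * part / whole


# Percentage: the same 50% can rest on very different denominators.
print(percentage(2, 4), percentage(2000, 4000))


def rate_per_1000(events, exposure):
    """Events per 1,000 units of exposure (here, customer visits)."""
    return 1000 * events / exposure


# Count versus rate: Store A has more complaints, Store B has the higher rate.
store_a = {"complaints": 30, "visits": 10_000}
store_b = {"complaints": 12, "visits": 2_000}
print(rate_per_1000(store_a["complaints"], store_a["visits"]))
print(rate_per_1000(store_b["complaints"], store_b["visits"]))
```

The mean of the invoices is 505 while the median is 130, and the two percentages are identical even though one is based on 4 items and the other on 4,000. Naming the denominator and the exposure makes these differences visible.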

Guide section

Missingness, bias, and charts

Data can mislead because of what is absent, not only because of what is present.

Missingness means some values or records are absent. Missing data can happen when people do not respond, fields are skipped, systems fail to capture information, or categories do not fit real situations. Bias means the data or method pushes results consistently in one direction. AI can amplify both problems when the data is used for classification, prediction, or recommendations.
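
The link between missingness and bias can be shown with a small invented survey. The sketch assumes that dissatisfied customers were less likely to respond. In real data the full set of scores is unknown, which is exactly why the pattern is hard to notice.

```python
from statistics import mean

# Hypothetical satisfaction scores (1 to 5) for all ten customers.
# In practice this complete list would not be available.
true_scores = [5, 5, 4, 4, 5, 4, 1, 2, 1, 2]

# What was actually recorded: None marks a customer who did not respond.
recorded = [5, 5, 4, 4, 5, 4, None, 2, None, None]

observed = [score for score in recorded if score is not None]
missing_share = recorded.count(None) / len(recorded)

print(f"Missing: {missing_share:.0%}")
print(f"Average of recorded scores: {mean(observed):.2f}")
print(f"Average of all scores: {mean(true_scores):.2f}")
```

Because the missing responses come mostly from low scorers, the recorded average is higher than the average for everyone. Dropping blanks silently would carry that upward push into any summary or AI output built on the data.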

Chart check

  • Does the chart title clearly say what is measured?
  • Are axes labeled and scaled fairly?
  • Does the chart show counts, percentages, rates, or averages?
  • Is the time period clear?
  • Are important groups missing or combined?
  • Does the chart support the claim, or only suggest a question?
  • Can you reproduce the chart from the underlying data?
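
The question about fair axis scaling can be checked with simple arithmetic. The sketch below compares how tall two bars are drawn with how much the values actually differ. The values and the `visual_ratio` helper are hypothetical.

```python
def visual_ratio(first, second, axis_start=0.0):
    """How many times taller the second bar is drawn than the first."""
    return (second - axis_start) / (first - axis_start)


last_year, this_year = 50.0, 52.0

# Actual relative change between the two values.
actual_change = (this_year - last_year) / last_year

# Bar heights when the axis starts at zero versus at 48.
ratio_from_zero = visual_ratio(last_year, this_year)
ratio_truncated = visual_ratio(last_year, this_year, axis_start=48.0)

print(f"Actual change: {actual_change:.0%}")
print(f"Drawn height ratio, axis from 0: {ratio_from_zero:.2f}")
print(f"Drawn height ratio, axis from 48: {ratio_truncated:.2f}")
```

With the axis starting at 48, the second bar is drawn twice as tall as the first even though the value rose by 4 percent. A truncated axis is not always wrong, but the reader should know it is there.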

Example

Dashboard scenario

A school dashboard shows that attendance improved after a new reminder program. A data-literate reader asks whether the same students were tracked before and after, whether holidays changed the number of school days in each period, whether remote learners were counted the same way, and whether the chart shows attendance rate or total attendance days.
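
The last question in this scenario, rate versus total days, can change the conclusion. The figures below are invented and assume enrollment grew between the two periods while the number of school days stayed the same.

```python
def attendance_rate(days_attended, students_enrolled, school_days):
    """Attended student-days divided by possible student-days."""
    return days_attended / (students_enrolled * school_days)


# Hypothetical figures for the period before and after the program.
before = {"days_attended": 3600, "students_enrolled": 200, "school_days": 20}
after = {"days_attended": 4250, "students_enrolled": 250, "school_days": 20}

rate_before = attendance_rate(**before)
rate_after = attendance_rate(**after)

print(f"Total days: {before['days_attended']} -> {after['days_attended']}")
print(f"Attendance rate: {rate_before:.0%} -> {rate_after:.0%}")
```

In this made-up case total attendance days rise from 3,600 to 4,250 while the attendance rate falls from 90 percent to 85 percent, because more students were enrolled. A chart of totals and a chart of rates would tell opposite stories.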

Guide section

Reproducibility and claim checking

A data claim is stronger when someone else can understand how it was made.

Exercise: check a data claim

  1. Copy the exact claim.
  2. Underline the metric, time period, group, and comparison.
  3. Find the source data or source document.
  4. Check definitions and whether the denominator is clear.
  5. Look for missing data, small sample size, duplicates, or changed methods.
  6. Check whether any accompanying chart supports the same claim.
  7. Rewrite the claim with a caveat if needed.
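
The exercise above can be turned into a small checklist in code. This is one possible sketch: the part names in `REQUIRED_PARTS`, the helper functions, and the sample claim are all hypothetical, and the caveat wording is only a suggestion.

```python
# Parts of a claim the exercise asks you to identify or verify (assumed names).
REQUIRED_PARTS = ["metric", "time_period", "group", "comparison", "source", "denominator"]


def missing_parts(claim_parts: dict) -> list[str]:
    """Return the parts that are absent or left blank."""
    return [part for part in REQUIRED_PARTS if not claim_parts.get(part)]


def with_caveat(claim: str, claim_parts: dict) -> str:
    """Append a caveat naming whatever could not be verified."""
    gaps = missing_parts(claim_parts)
    if not gaps:
        return claim
    return f"{claim} (Caveat: not yet verified: {', '.join(gaps)}.)"


claim = "Attendance improved after the reminder program."
parts = {
    "metric": "attendance rate",
    "time_period": "",
    "group": "all enrolled students",
    "comparison": "before versus after the program",
    "source": "",
    "denominator": "enrolled student-days",
}

print(with_caveat(claim, parts))
```

The code only records what a person found. Deciding whether a definition is sound or a source is trustworthy still takes human judgment.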

Avoidable errors

Common mistakes and better approaches

Trusting a chart because it looks professional.

Better approach: Check the source, metric, denominator, time period, and axis scale.

Assuming complete data is accurate data.

Better approach: Check completeness and accuracy separately.

Repeating a percentage without naming the denominator.

Better approach: Say what the percentage is out of and whether the base is large enough.

Ignoring missing values.

Better approach: Ask why values are missing and whether missingness changes the conclusion.

Remember this

Key takeaways

  • Data literacy starts with source, purpose, definitions, and limits.
  • Quality includes completeness, accuracy, consistency, timeliness, validity, and uniqueness.
  • Averages, percentages, and rates answer different questions.
  • Sample size and uncertainty affect how strongly a claim can be made.
  • Charts can clarify or mislead.
  • Missing data and bias matter for AI outputs and dashboards.
  • Reproducibility means another person can understand how the result was made.

Questions readers ask

Frequently asked questions

Why do AI users need data literacy?

AI systems often rely on data, examples, documents, or patterns. If the data is incomplete, biased, outdated, or poorly defined, the output may look useful while carrying those problems forward.

What is the difference between missing data and biased data?

Missing data means some values or records are absent. Bias means the data or method pushes results consistently in one direction. Missing data can create bias when the missing records are not random.

Is a bigger sample always better?

A larger sample can improve reliability, but relevance, collection method, response patterns, and measurement quality still matter.
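
One way to see what a larger sample does and does not fix is the common approximate margin of error for a percentage. The sketch below assumes a simple random sample, the normal approximation, and a 95 percent confidence level (z = 1.96). Many real surveys do not meet these assumptions.

```python
from math import sqrt


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample."""
    return z * sqrt(proportion * (1 - proportion) / sample_size)


# Same observed proportion (50%), different sample sizes.
for n in (100, 1_000, 10_000):
    print(n, f"+/- {margin_of_error(0.5, n):.1%}")
```

The margin shrinks as the sample grows, from roughly 9.8 points at 100 responses to about 1 point at 10,000. It describes random error only. If the sample leaves out a group or the question is poorly worded, collecting more of the same responses does not remove that problem.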

How can a beginner check a dashboard?

Start with the title, source, metric, denominator, time period, filters, missing groups, and whether the chart supports the claim being made.

Can AI clean my data for me?

AI may help spot possible blanks, duplicates, or inconsistent labels, but a person must decide what values mean, what can be changed, and how cleaning affects the result.

Sources and review notes

Sources were accessed on the dates shown.

  1. SRC-01
    Artificial Intelligence Risk Management Framework · National Institute of Standards and Technology · Published 2023-01-26 · Accessed 2026-06-20
  2. SRC-02
    Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile · National Institute of Standards and Technology · Published 2024-07-26 · Accessed 2026-06-20
  3. SRC-06
    Forum Guide to Data Literacy · Institute of Education Sciences and National Center for Education Statistics · Published 2024-07-01 · Accessed 2026-06-20
  4. SRC-07
    Sample Size and Data Quality · U.S. Census Bureau · Accessed 2026-06-20
  5. SRC-08
    The Government Data Quality Framework · Government Digital Service and Central Digital and Data Office · Published 2020-12-03 · Accessed 2026-06-20
  6. SRC-12
    Missing Data and Observational Data Modeling · U.S. Census Bureau · Accessed 2026-06-20
