Data Interpretation: Definition and Steps with Examples

Data interpretation is the process of collecting data from one or more sources, analyzing it using appropriate methods, and drawing conclusions.

A good data interpretation process is key to making your data usable. It will help you make sure you’re drawing the correct conclusions and acting on your information.

Data is everywhere in the modern world, and organizations fall into two groups: those drowning in data or failing to use it appropriately, and those benefiting from it.

In this blog, you will learn the definition of data interpretation and its primary steps and examples.

What is Data Interpretation?

Data interpretation is the process of reviewing data and arriving at relevant conclusions using various analytical research methods. Data analysis assists researchers in categorizing, manipulating, and summarizing data to answer critical questions.

In business terms, the interpretation of data is the execution of various processes. This process analyzes and revises data to gain insights and recognize emerging patterns and behaviors. These conclusions will assist you as a manager in making an informed decision based on numbers while having all of the facts at your disposal.

Importance of Data Interpretation

Raw data is useless until it is interpreted. Data interpretation matters to businesses and individuals alike because it turns collected data into informed decisions.

Make better decisions

Any decision is based on the information that is available at the time. People used to think that many diseases were caused by bad blood, one of the four humors, so the solution was to get rid of the bad blood. We now know that viruses, bacteria, and immune responses can cause illness, and we can act accordingly.

In the same way, when you know how to collect and understand data well, you can make better decisions. You can confidently choose a path for your organization or even your life instead of working with assumptions.

The most important thing is to follow a transparent process that reduces errors and decision fatigue.

Find trends and take action

Another practical use of data interpretation is to get ahead of trends before they reach their peak. Some people have made a living by researching industries, spotting trends, and then making big bets on them.

With proper data interpretation and a little bit of work, you can catch trends as they start and use them to help your business or yourself grow.

Better resource allocation

The last benefit of data interpretation we will discuss is the ability to use people, tools, money, and other resources more efficiently. For example, if strong data interpretation shows you that a market is underserved, you can go after it with more energy and win.

In the same way, you may find that a market you thought was a good fit is actually a poor one. This could be because the market is too big for your products to serve, the competition is too strong, or something else.

Either way, you can redirect your resources faster and more precisely to get better results.

What are the steps in interpreting data?

Here are some steps to interpreting data correctly.

Gather the data

The very first step in data interpretation is gathering all relevant data. You can do this by first visualizing it in a bar graph or pie chart. The aim of this step is to analyze the data accurately and without bias. Now is the time to recall how you conducted your research.

Here are two questions that will help you assess your data collection:

  • Were there any flaws or changes that occurred during the data collection process?
  • Have you saved any observatory notes or indicators?

You can proceed to the next stage when you have all of your data.

Develop your discoveries

This is a summary of your findings. Here, you thoroughly examine the data to identify trends, patterns, or behaviors. If you are researching a group of people using a sample population, this is the section where you examine behavioral patterns. You can then compare these deductions to previous data sets, similar data sets, or general hypotheses in your industry before drawing any conclusions.

Draw conclusions

After you’ve developed findings from your data sets, you can draw conclusions based on the trends you discovered. Your conclusions should address the questions that prompted your research. If they do not, ask why; the answer may prompt additional research or new questions.

Give recommendations

This stage brings the data interpretation process to a close. Every research conclusion must include a recommendation. As recommendations are a summary of your findings and conclusions, they should be brief. There are only two options: you can either recommend a course of action or suggest additional research.

Data interpretation examples

Here are two examples of data interpretations to help you understand it better:

Let’s say your users fall into four age groups. A company can then see which age group engages most with its content or product and, based on bar or pie charts, develop a marketing strategy to reach uninvolved groups or an outreach strategy to grow its core user base.
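
As a rough sketch of how such a summary might be produced in practice, here is a short Python example using pandas and matplotlib. The user data, column names, and age brackets are entirely hypothetical and exist only for illustration:

```python
# Hypothetical sketch: summarize engagement by age group and chart it.
import pandas as pd
import matplotlib.pyplot as plt

# Invented data for illustration only.
users = pd.DataFrame({
    "age_group": ["18-24", "25-34", "35-44", "45+", "25-34", "18-24", "25-34"],
    "sessions":  [3, 12, 7, 2, 15, 4, 9],
})

# Average sessions per age group shows which segment engages most.
engagement = users.groupby("age_group")["sessions"].mean()
print(engagement)

# A bar chart makes the comparison immediate.
engagement.plot(kind="bar", title="Average sessions by age group")
plt.tight_layout()
plt.show()
```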

Another example of data analysis is businesses’ use of a recruitment CRM. They utilize it to find candidates, track their progress, and manage the entire hiring process to determine how they can better automate their workflow.

Overall, data interpretation is an essential factor in data-driven decision-making. It should be performed regularly as part of an iterative interpretation process. Investors, developers, and sales and acquisition professionals can all benefit from routine data interpretation, and it is what you do with those insights that determines the success of your business.

Contact QuestionPro experts if you need assistance conducting research or creating a data analysis. We can walk you through the process and help you make the most of your data.

Data Interpretation – Process, Methods and Questions

Definition:

Data interpretation refers to the process of making sense of data by analyzing and drawing conclusions from it. It involves examining data in order to identify patterns, relationships, and trends that can help explain the underlying phenomena being studied. Data interpretation can be used to make informed decisions and solve problems across a wide range of fields, including business, science, and social sciences.

Data Interpretation Process

Here are the steps involved in the data interpretation process:

  • Define the research question: The first step in data interpretation is to clearly define the research question. This will help you to focus your analysis and ensure that you are interpreting the data in a way that is relevant to your research objectives.
  • Collect the data: The next step is to collect the data. This can be done through a variety of methods such as surveys, interviews, observation, or secondary data sources.
  • Clean and organize the data: Once the data has been collected, it is important to clean and organize it. This involves checking for errors, inconsistencies, and missing data. Data cleaning can be a time-consuming process, but it is essential to ensure that the data is accurate and reliable.
  • Analyze the data: The next step is to analyze the data. This can involve using statistical software or other tools to calculate summary statistics, create graphs and charts, and identify patterns in the data.
  • Interpret the results: Once the data has been analyzed, it is important to interpret the results. This involves looking for patterns, trends, and relationships in the data. It also involves drawing conclusions based on the results of the analysis.
  • Communicate the findings: The final step is to communicate the findings. This can involve creating reports, presentations, or visualizations that summarize the key findings of the analysis. It is important to communicate the findings in a way that is clear and concise, and that is tailored to the audience’s needs. A code sketch of the middle steps follows this list.
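
To make the cleaning, analysis, and communication steps concrete, here is a minimal pandas sketch. The survey data and column names are invented for illustration; a real project would read from its own sources:

```python
# Hypothetical sketch of collecting, cleaning, analyzing, and summarizing data.
import pandas as pd

# Collect: inlined here; in practice this might be pd.read_csv("survey.csv").
raw = pd.DataFrame({
    "respondent": [1, 2, 3, 4, 5],
    "satisfaction": [4, 5, None, 3, 5],   # one missing response to clean
    "region": ["north", "south", "south", "north", "north"],
})

# Clean and organize: drop incomplete responses.
clean = raw.dropna(subset=["satisfaction"])

# Analyze: summary statistics per region.
summary = clean.groupby("region")["satisfaction"].agg(["mean", "count"])

# Interpret and communicate: inspect (or report) the resulting table.
print(summary)
```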

Types of Data Interpretation

There are various types of data interpretation techniques used for analyzing and making sense of data. Here are some of the most common types:

Descriptive Interpretation

This type of interpretation involves summarizing and describing the key features of the data. This can involve calculating measures of central tendency (such as mean, median, and mode), measures of dispersion (such as range, variance, and standard deviation), and creating visualizations such as histograms, box plots, and scatterplots.
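
For instance, all of these descriptive measures can be computed with Python's standard library alone (the numbers are invented):

```python
# Descriptive interpretation: central tendency and dispersion on toy data.
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

print("mean:    ", statistics.mean(data))      # 5.0
print("median:  ", statistics.median(data))    # 4.5
print("mode:    ", statistics.mode(data))      # 4
print("range:   ", max(data) - min(data))      # 7
print("variance:", statistics.variance(data))  # sample variance
print("stdev:   ", statistics.stdev(data))     # sample standard deviation
```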

Inferential Interpretation

This type of interpretation involves making inferences about a larger population based on a sample of the data. This can involve hypothesis testing, where you test a hypothesis about a population parameter using sample data, or confidence interval estimation, where you estimate a range of values for a population parameter based on sample data.
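
A minimal sketch of both ideas, using scipy on invented sample values:

```python
# Inferential interpretation: one-sample t-test plus a confidence interval.
import numpy as np
from scipy import stats

sample = np.array([5.1, 4.9, 5.6, 5.2, 4.8, 5.4, 5.0, 5.3])  # hypothetical

# Hypothesis test: does the population mean differ from 5.0?
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")

# 95% confidence interval for the population mean.
low, high = stats.t.interval(
    0.95, df=len(sample) - 1, loc=sample.mean(), scale=stats.sem(sample)
)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```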

Predictive Interpretation

This type of interpretation involves using data to make predictions about future outcomes. This can involve building predictive models using statistical techniques such as regression analysis, time-series analysis, or machine learning algorithms.
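
As one example, a simple linear regression with scikit-learn (the ad-spend and sales figures are made up):

```python
# Predictive interpretation: fit a regression, then forecast a new value.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical history: ad spend (in $1,000s) versus units sold.
ad_spend = np.array([[10], [15], [20], [25], [30]])
units_sold = np.array([120, 150, 205, 240, 290])

model = LinearRegression().fit(ad_spend, units_sold)
print("slope:", model.coef_[0], "intercept:", model.intercept_)

# Forecast units sold for a planned $35k spend (illustration only).
print("forecast:", model.predict(np.array([[35]]))[0])
```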

Exploratory Interpretation

This type of interpretation involves exploring the data to identify patterns and relationships that were not previously known. This can involve data mining techniques such as clustering analysis, principal component analysis, or association rule mining.
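
A toy clustering sketch with scikit-learn, run on synthetic customer data generated just for the example:

```python
# Exploratory interpretation: let k-means surface groupings in the data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(seed=42)
# Two synthetic "blobs": (annual spend in $, store visits per month).
group_a = rng.normal(loc=[200.0, 2.0], scale=[30.0, 0.5], size=(50, 2))
group_b = rng.normal(loc=[800.0, 8.0], scale=[60.0, 1.0], size=(50, 2))
customers = np.vstack([group_a, group_b])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print("cluster centers:\n", kmeans.cluster_centers_)
print("first ten labels:", kmeans.labels_[:10])
```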

Causal Interpretation

This type of interpretation involves identifying causal relationships between variables in the data. This can involve experimental designs, such as randomized controlled trials, or methods suited to observational data, such as regression analysis or propensity score matching.

Data Interpretation Methods

There are various methods for data interpretation that can be used to analyze and make sense of data. Here are some of the most common methods:

Statistical Analysis

This method involves using statistical techniques to analyze the data. Statistical analysis can involve descriptive statistics (such as measures of central tendency and dispersion), inferential statistics (such as hypothesis testing and confidence interval estimation), and predictive modeling (such as regression analysis and time-series analysis).

Data Visualization

This method involves using visual representations of the data to identify patterns and trends. Data visualization can involve creating charts, graphs, and other visualizations, such as heat maps or scatterplots.

Text Analysis

This method involves analyzing text data, such as survey responses or social media posts, to identify patterns and themes. Text analysis can involve techniques such as sentiment analysis, topic modeling, and natural language processing.
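
As a heavily reduced sketch of this idea, here is a word-frequency count over a few invented survey responses, a common first step before sentiment analysis or topic modeling:

```python
# Text analysis sketch: most frequent words in open-ended responses.
import re
from collections import Counter

responses = [
    "The staff was friendly and the food was fresh",
    "Fresh food but small portions",
    "Friendly staff, great service",
]

stopwords = {"the", "was", "and", "but", "a"}
words = []
for text in responses:
    words += [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]

# Recurring words hint at candidate themes such as "staff" or "fresh food".
print(Counter(words).most_common(5))
```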

Machine Learning

This method involves using algorithms to identify patterns in the data and make predictions or classifications. Machine learning can involve techniques such as decision trees, neural networks, and random forests.
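
For instance, a random forest classifier trained on scikit-learn's bundled iris dataset (a standard toy dataset, used here purely for illustration):

```python
# Machine learning sketch: train a random forest and report test accuracy.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```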

Qualitative Analysis

This method involves analyzing non-numeric data, such as interviews or focus group discussions, to identify themes and patterns. Qualitative analysis can involve techniques such as content analysis, grounded theory, and narrative analysis.

Geospatial Analysis

This method involves analyzing spatial data, such as maps or GPS coordinates, to identify patterns and relationships. Geospatial analysis can involve techniques such as spatial autocorrelation, hot spot analysis, and clustering.

Applications of Data Interpretation

Data interpretation has a wide range of applications across different fields, including business, healthcare, education, social sciences, and more. Here are some examples of how data interpretation is used in different applications:

  • Business: Data interpretation is widely used in business to inform decision-making, identify market trends, and optimize operations. For example, businesses may analyze sales data to identify the most popular products or customer demographics, or use predictive modeling to forecast demand and adjust pricing accordingly.
  • Healthcare: Data interpretation is critical in healthcare for identifying disease patterns, evaluating treatment effectiveness, and improving patient outcomes. For example, healthcare providers may use electronic health records to analyze patient data and identify risk factors for certain diseases or conditions.
  • Education: Data interpretation is used in education to assess student performance, identify areas for improvement, and evaluate the effectiveness of instructional methods. For example, schools may analyze test scores to identify students who are struggling and provide targeted interventions to improve their performance.
  • Social sciences: Data interpretation is used in social sciences to understand human behavior, attitudes, and perceptions. For example, researchers may analyze survey data to identify patterns in public opinion or use qualitative analysis to understand the experiences of marginalized communities.
  • Sports: Data interpretation is increasingly used in sports to inform strategy and improve performance. For example, coaches may analyze performance data to identify areas for improvement or use predictive modeling to assess the likelihood of injuries or other risks.

When to use Data Interpretation

Data interpretation is used to make sense of complex data and to draw conclusions from it. It is particularly useful when working with large datasets or when trying to identify patterns or trends in the data. Data interpretation can be used in a variety of settings, including scientific research, business analysis, and public policy.

In scientific research, data interpretation is often used to draw conclusions from experiments or studies. Researchers use statistical analysis and data visualization techniques to interpret their data and to identify patterns or relationships between variables. This can help them to understand the underlying mechanisms of their research and to develop new hypotheses.

In business analysis, data interpretation is used to analyze market trends and consumer behavior. Companies can use data interpretation to identify patterns in customer buying habits, to understand market trends, and to develop marketing strategies that target specific customer segments.

In public policy, data interpretation is used to inform decision-making and to evaluate the effectiveness of policies and programs. Governments and other organizations use data interpretation to track the impact of policies and programs over time, to identify areas where improvements are needed, and to develop evidence-based policy recommendations.

In general, data interpretation is useful whenever large amounts of data need to be analyzed and understood in order to make informed decisions.

Data Interpretation Examples

Here are some real-time examples of data interpretation:

  • Social media analytics: Social media platforms generate vast amounts of data every second, and businesses can use this data to analyze customer behavior, track sentiment, and identify trends. Data interpretation in social media analytics involves analyzing data in real-time to identify patterns and trends that can help businesses make informed decisions about marketing strategies and customer engagement.
  • Healthcare analytics: Healthcare organizations use data interpretation to analyze patient data, track outcomes, and identify areas where improvements are needed. Real-time data interpretation can help healthcare providers make quick decisions about patient care, such as identifying patients who are at risk of developing complications or adverse events.
  • Financial analysis: Real-time data interpretation is essential for financial analysis, where traders and analysts need to make quick decisions based on changing market conditions. Financial analysts use data interpretation to track market trends, identify opportunities for investment, and develop trading strategies.
  • Environmental monitoring: Real-time data interpretation is important for environmental monitoring, where data is collected from various sources such as satellites, sensors, and weather stations. Data interpretation helps to identify patterns and trends that can help predict natural disasters, track changes in the environment, and inform decision-making about environmental policies.
  • Traffic management: Real-time data interpretation is used for traffic management, where traffic sensors collect data on traffic flow, congestion, and accidents. Data interpretation helps to identify areas where traffic congestion is high, and helps traffic management authorities make decisions about road maintenance, traffic signal timing, and other strategies to improve traffic flow.

Data Interpretation Questions

Here are some sample data interpretation questions:

  • Medical: What is the correlation between a patient’s age and their risk of developing a certain disease?
  • Environmental Science: What is the trend in the concentration of a certain pollutant in a particular body of water over the past 10 years?
  • Finance: What is the correlation between a company’s stock price and its quarterly revenue?
  • Education: What is the trend in graduation rates for a particular high school over the past 5 years?
  • Marketing: What is the correlation between a company’s advertising budget and its sales revenue?
  • Sports: What is the trend in the number of home runs hit by a particular baseball player over the past 3 seasons?
  • Social Science: What is the correlation between a person’s level of education and their income level?

In order to answer these questions, you would need to analyze and interpret the data using statistical methods, graphs, and other visualization tools.
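
Most of the correlation questions above come down to the same computation. A sketch with invented education and income figures:

```python
# Compute a Pearson correlation coefficient on hypothetical data.
import numpy as np

years_of_education = np.array([10, 12, 12, 14, 16, 16, 18, 20])
income_thousands   = np.array([28, 35, 33, 41, 50, 55, 60, 72])

# Values near +1 or -1 indicate a strong linear relationship; near 0, none.
r = np.corrcoef(years_of_education, income_thousands)[0, 1]
print(f"Pearson r = {r:.2f}")
```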

Purpose of Data Interpretation

The purpose of data interpretation is to make sense of complex data by analyzing and drawing insights from it. The process of data interpretation involves identifying patterns and trends, making comparisons, and drawing conclusions based on the data. The ultimate goal of data interpretation is to use the insights gained from the analysis to inform decision-making.

Data interpretation is important because it allows individuals and organizations to:

  • Understand complex data: Data interpretation helps individuals and organizations to make sense of complex data sets that would otherwise be difficult to understand.
  • Identify patterns and trends: Data interpretation helps to identify patterns and trends in data, which can reveal important insights about the underlying processes and relationships.
  • Make informed decisions: Data interpretation provides individuals and organizations with the information they need to make informed decisions based on the insights gained from the data analysis.
  • Evaluate performance: Data interpretation helps individuals and organizations to evaluate their performance over time and to identify areas where improvements can be made.
  • Communicate findings: Data interpretation allows individuals and organizations to communicate their findings to others in a clear and concise manner, which is essential for informing stakeholders and making changes based on the insights gained from the analysis.

Characteristics of Data Interpretation

Here are some characteristics of data interpretation:

  • Contextual: Data interpretation is always contextual, meaning that the interpretation of data depends on the context in which it is analyzed; the same data may carry different meanings in different contexts.
  • Iterative: Data interpretation is an iterative process, meaning that it often involves multiple rounds of analysis and refinement as more data becomes available or as new insights are gained from the analysis.
  • Subjective: Data interpretation is often subjective, as it involves the interpretation of data by individuals who may have different perspectives and biases. It is important to acknowledge and address these biases when interpreting data.
  • Analytical: Data interpretation involves the use of analytical tools and techniques to analyze and draw insights from data. These may include statistical analysis, data visualization, and other data analysis methods.
  • Evidence-based: Data interpretation is evidence-based, meaning that it is based on the data and the insights gained from the analysis. It is important to ensure that the data used in the analysis is accurate, relevant, and reliable.
  • Actionable: Data interpretation is actionable, meaning that it provides insights that can be used to inform decision-making and to drive action. The ultimate goal of data interpretation is to use the insights gained from the analysis to improve performance or to achieve specific goals.

Advantages of Data Interpretation

Data interpretation has several advantages, including:

  • Improved decision-making: Data interpretation provides insights that can be used to inform decision-making. By analyzing data and drawing insights from it, individuals and organizations can make informed decisions based on evidence rather than intuition.
  • Identification of patterns and trends: Data interpretation helps to identify patterns and trends in data, which can reveal important insights about the underlying processes and relationships. This information can be used to improve performance or to achieve specific goals.
  • Evaluation of performance: Data interpretation helps individuals and organizations to evaluate their performance over time and to identify areas where improvements can be made. By analyzing data, organizations can identify strengths and weaknesses and make changes to improve their performance.
  • Communication of findings: Data interpretation allows individuals and organizations to communicate their findings to others in a clear and concise manner, which is essential for informing stakeholders and making changes based on the insights gained from the analysis.
  • Better resource allocation: Data interpretation can help organizations allocate resources more efficiently by identifying areas where resources are needed most. By analyzing data, organizations can identify areas where resources are being underutilized or where additional resources are needed to improve performance.
  • Improved competitiveness: Data interpretation can give organizations a competitive advantage by providing insights that help to improve performance, reduce costs, or identify new opportunities for growth.

Limitations of Data Interpretation

Data interpretation has some limitations, including:

  • Limited by the quality of data: The quality of data used in data interpretation can greatly impact the accuracy of the insights gained from the analysis. Poor quality data can lead to incorrect conclusions and decisions.
  • Subjectivity: Data interpretation can be subjective, as it involves the interpretation of data by individuals who may have different perspectives and biases. This can lead to different interpretations of the same data.
  • Limited by analytical tools: The analytical tools and techniques used in data interpretation can also limit the accuracy of the insights gained from the analysis. Different analytical tools may yield different results, and some tools may not be suitable for certain types of data.
  • Time-consuming: Data interpretation can be a time-consuming process, particularly for large and complex data sets. This can make it difficult to quickly make decisions based on the insights gained from the analysis.
  • Incomplete data: Data interpretation can be limited by incomplete data sets, which may not provide a complete picture of the situation being analyzed. Incomplete data can lead to incorrect conclusions and decisions.
  • Limited by context: Data interpretation is always contextual, meaning that the interpretation of data is dependent on the context in which it is analyzed. The same data may have different meanings depending on the context in which it is analyzed.

Difference between Data Interpretation and Data Analysis

Data interpretation and data analysis are two different but closely related processes in data-driven decision-making.

Data analysis refers to the process of inspecting and examining data using statistical and computational methods to derive insights and conclusions from it. It involves cleaning, transforming, and modeling the data to uncover patterns, relationships, and trends that can help in understanding the underlying phenomena.

Data interpretation, on the other hand, refers to the process of making sense of the findings from the data analysis by contextualizing them within the larger problem domain. It involves identifying the key takeaways from the data analysis, assessing their relevance and significance to the problem at hand, and communicating the insights in a clear and actionable manner.

In short, data analysis is about uncovering insights from the data, while data interpretation is about making sense of those insights and translating them into actionable recommendations.

A Guide To The Methods, Benefits & Problems of The Interpretation of Data

Table of Contents

1) What Is Data Interpretation?

2) How To Interpret Data?

3) Why Data Interpretation Is Important?

4) Data Interpretation Skills

5) Data Analysis & Interpretation Problems

6) Data Interpretation Techniques & Methods

7) The Use of Dashboards For Data Interpretation

8) Business Data Interpretation Examples

Data analysis and interpretation have now taken center stage with the advent of the digital age… and the sheer amount of data can be frightening. In fact, a Digital Universe study found that the total data supply in 2012 was 2.8 trillion gigabytes! Based on that amount of data alone, it is clear the calling card of any successful enterprise in today’s global world will be the ability to analyze complex data, produce actionable insights, and adapt to new market needs… all at the speed of thought.

Business dashboards are the digital age tools for big data. Capable of displaying key performance indicators (KPIs) for both quantitative and qualitative data analyses, they are ideal for making the fast-paced and data-driven market decisions that push today’s industry leaders to sustainable success. Through the art of streamlined visual communication, data dashboards permit businesses to engage in real-time and informed decision-making and are key instruments in data interpretation. First of all, let’s find a definition to understand what lies behind this practice.

What Is Data Interpretation?

Data interpretation refers to the process of using diverse analytical methods to review data and arrive at relevant conclusions. The interpretation of data helps researchers to categorize, manipulate, and summarize the information in order to answer critical questions.

The importance of data interpretation is evident, and this is why it needs to be done properly. Data is very likely to arrive from multiple sources and has a tendency to enter the analysis process with haphazard ordering. Data analysis tends to be extremely subjective. That is to say, the nature and goal of interpretation will vary from business to business, likely correlating to the type of data being analyzed. While there are several types of processes that are implemented based on the nature of individual data, the two broadest and most common categories are “quantitative and qualitative analysis.”

Yet, before any serious data interpretation inquiry can begin, it should be understood that visual presentations of data findings are irrelevant unless a sound decision has been made regarding measurement scales. The measurement scale for the data must be decided early, as it will have a long-term impact on data interpretation ROI. The varying scales include:

  • Nominal Scale: non-numeric categories that cannot be ranked or compared quantitatively. Variables are exclusive and exhaustive.
  • Ordinal Scale: categories that are exclusive and exhaustive and follow a logical order. Quality ratings and agreement ratings are examples of ordinal scales (i.e., good, very good, fair, etc., OR agree, strongly agree, disagree, etc.).
  • Interval: a measurement scale where data is grouped into categories with orderly and equal distances between the categories. The zero point is always arbitrary.
  • Ratio: contains the features of all three scales, plus a true, meaningful zero point (for example, height or weight). A small code illustration of the first two scales follows this list.
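
One way to make the nominal/ordinal distinction concrete: pandas lets you declare categorical data as ordered or unordered, which constrains which operations make sense (a small, purely illustrative sketch):

```python
# Nominal vs. ordinal data in pandas (illustrative).
import pandas as pd

# Nominal: categories with no inherent order.
region = pd.Categorical(["north", "south", "east"], ordered=False)

# Ordinal: categories with a logical order.
rating = pd.Categorical(
    ["good", "fair", "very good", "good"],
    categories=["fair", "good", "very good"],
    ordered=True,
)

print(rating.min(), "->", rating.max())  # fair -> very good
# region.min() would raise a TypeError: unordered categories cannot be ranked.
```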

For a more in-depth review of scales of measurement, read our article on data analysis questions. Once measurement scales have been selected, it is time to select which of the two broad interpretation processes will best suit your data needs. Let’s take a closer look at those specific methods and possible data interpretation problems.

How To Interpret Data? Top Methods & Techniques

When interpreting data, an analyst must try to discern the differences between correlation, causation, and coincidence, as well as many other biases, and must also consider all the factors that may have led to a result. There are various data interpretation types and methods one can use to achieve this.

The interpretation of data is designed to help people make sense of numerical data that has been collected, analyzed, and presented. Having a baseline method for interpreting data will provide your analyst teams with a structure and consistent foundation. Indeed, if several departments have different approaches to interpreting the same data while sharing the same goals, some mismatched objectives can result. Disparate methods will lead to duplicated efforts, inconsistent solutions, wasted energy, and inevitably – time and money. In this part, we will look at the two main methods of interpretation of data: qualitative and quantitative analysis.

Qualitative Data Interpretation

Qualitative data analysis can be summed up in one word – categorical. With this type of analysis, data is not described through numerical values or patterns but through the use of descriptive context (i.e., text). Typically, narrative data is gathered by employing a wide variety of person-to-person techniques. These techniques include:

  • Observations: detailing behavioral patterns that occur within an observation group. These patterns could be the amount of time spent in an activity, the type of activity, and the method of communication employed.
  • Focus groups: gathering people in groups and asking them relevant questions to generate a collaborative discussion about a research topic.
  • Secondary Research: much like how patterns of behavior can be observed, various types of documentation resources can be coded and divided based on the type of material they contain.
  • Interviews: one of the best collection methods for narrative data. Inquiry responses can be grouped by theme, topic, or category. The interview approach allows for highly focused data segmentation.

A key difference between qualitative and quantitative analysis is clearly noticeable in the interpretation stage. The first one is widely open to interpretation and must be “coded” so as to facilitate the grouping and labeling of data into identifiable themes. As person-to-person data collection techniques can often result in disputes pertaining to proper analysis, qualitative data analysis is often summarized through three basic principles: notice things, collect things, and think about things.

After qualitative data has been collected through transcripts, questionnaires, audio and video recordings, or the researcher’s notes, it is time to interpret it. For that purpose, there are some common methods used by researchers and analysts.

  • Content analysis: As its name suggests, this is a research method used to identify frequencies and recurring words, subjects, and concepts in image, video, or audio content. It transforms qualitative information into quantitative data to help discover trends and conclusions that will later support important research or business decisions. This method is often used by marketers to understand brand sentiment from the mouths of customers themselves. Through that, they can extract valuable information to improve their products and services. It is recommended to use content analytics tools for this method as manually performing it is very time-consuming and can lead to human error or subjectivity issues. Having a clear goal in mind before diving into it is another great practice for avoiding getting lost in the fog.
  • Thematic analysis: This method focuses on analyzing qualitative data, such as interview transcripts, survey questions, and others, to identify common patterns and separate the data into different groups according to found similarities or themes. For example, imagine you want to analyze what customers think about your restaurant. For this purpose, you do a thematic analysis on 1000 reviews and find common themes such as “fresh food”, “cold food”, “small portions”, “friendly staff”, etc. With those recurring themes in hand, you can extract conclusions about what could be improved or enhanced based on your customer’s experiences. Since this technique is more exploratory, be open to changing your research questions or goals as you go. 
  • Narrative analysis: A bit more specific and complicated than the two previous methods, it is used to analyze stories and discover their meaning. These stories can be extracted from testimonials, case studies, and interviews, as these formats give people more space to tell their experiences. Given that collecting this kind of data is harder and more time-consuming, sample sizes for narrative analysis are usually smaller, which makes it harder to reproduce its findings. However, it is still a valuable technique for understanding customers' preferences and mindsets.  
  • Discourse analysis: This method is used to draw the meaning of any type of visual, written, or symbolic language in relation to a social, political, cultural, or historical context. It is used to understand how context can affect how language is carried out and understood. For example, if you are doing research on power dynamics, using discourse analysis to analyze a conversation between a janitor and a CEO and draw conclusions about their responses based on the context and your research questions is a great use case for this technique. That said, like all methods in this section, discourse analysis is time-consuming as the data needs to be analyzed until no new insights emerge.
  • Grounded theory analysis: The grounded theory approach aims to create or discover a new theory by carefully testing and evaluating the data available. Unlike all other qualitative approaches on this list, grounded theory helps extract conclusions and hypotheses from the data instead of going into the analysis with a defined hypothesis. This method is very popular amongst researchers, analysts, and marketers as the results are completely data-backed, providing a factual explanation of any scenario. It is often used when researching a completely new topic or one about which little is known, as it gives space to start from the ground up.

Quantitative Data Interpretation

If quantitative data interpretation could be summed up in one word (and it really can’t), that word would be “numerical.” There are few certainties when it comes to data analysis, but you can be sure that if the research you are engaging in has no numbers involved, it is not quantitative research, as this analysis refers to a set of processes by which numerical data is analyzed. More often than not, it involves the use of statistical modeling such as standard deviation, mean, and median. Let’s quickly review the most common statistical terms:

  • Mean: A mean represents a numerical average for a set of responses. When dealing with a data set (or multiple data sets), a mean will represent the central value of a specific set of numbers. It is the sum of the values divided by the number of values within the data set. Other terms that can be used to describe the concept are arithmetic mean, average, and mathematical expectation.
  • Standard deviation: This is another statistical term commonly used in quantitative analysis. Standard deviation reveals the distribution of the responses around the mean. It describes the degree of consistency within the responses; together with the mean, it provides insight into data sets.
  • Frequency distribution: This is a measurement gauging the rate of a response’s appearance within a data set. With survey data, for example, frequency distribution can determine the number of times a specific ordinal scale response appears (i.e., agree, strongly agree, disagree, etc.). Frequency distribution is extremely useful for determining the degree of consensus among data points (see the sketch below).
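
All three terms in code, using pandas on an invented set of 1-to-5 survey responses:

```python
# Mean, standard deviation, and frequency distribution (hypothetical survey).
import pandas as pd

scores = pd.Series([4, 5, 3, 4, 4, 5, 2, 4, 5, 4])  # 1-5 agreement scale

print("mean:", scores.mean())   # central value of the responses (4.0)
print("std: ", scores.std())    # spread of responses around the mean
print("frequency distribution:")
print(scores.value_counts().sort_index())  # how often each response appears
```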

Typically, quantitative data is measured by visually presenting correlation tests between two or more variables of significance. Different processes can be used together or separately, and comparisons can be made to ultimately arrive at a conclusion. Other signature interpretation processes of quantitative data include:

  • Regression analysis: Essentially, it uses historical data to understand the relationship between a dependent variable and one or more independent variables. Knowing which variables are related and how they developed in the past allows you to anticipate possible outcomes and make better decisions going forward. For example, if you want to predict your sales for next month, you can use regression to understand what factors will affect them, such as products on sale and the launch of a new campaign, among many others. 
  • Cohort analysis: This method identifies groups of users who share common characteristics during a particular time period. In a business scenario, cohort analysis is commonly used to understand customer behaviors. For example, a cohort could be all users who have signed up for a free trial on a given day. An analysis would be carried out to see how these users behave, what actions they carry out, and how their behavior differs from other user groups (a code sketch of this appears after the list).
  • Predictive analysis: As its name suggests, the predictive method aims to predict future developments by analyzing historical and current data. Powered by technologies such as artificial intelligence and machine learning, predictive analytics practices enable businesses to identify patterns or potential issues and plan informed strategies in advance.
  • Prescriptive analysis: Also powered by predictions, the prescriptive method uses techniques such as graph analysis, complex event processing, and neural networks, among others, to try to unravel the effect that future decisions will have in order to adjust them before they are actually made. This helps businesses to develop responsive, practical business strategies.
  • Conjoint analysis: Typically applied to survey analysis, the conjoint approach is used to analyze how individuals value different attributes of a product or service. This helps researchers and businesses to define pricing, product features, packaging, and many other attributes. A common use is menu-based conjoint analysis, in which individuals are given a “menu” of options from which they can build their ideal concept or product. Through this, analysts can understand which attributes they would pick above others and drive conclusions.
  • Cluster analysis: Last but not least, the cluster is a method used to group objects into categories. Since there is no target variable when using cluster analysis, it is a useful method to find hidden trends and patterns in the data. In a business context, clustering is used for audience segmentation to create targeted experiences. In market research, it is often used to identify age groups, geographical information, and earnings, among others.
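
To pick one of these, here is a bare-bones cohort analysis in pandas. The users, signup months, and activity records are invented; a real analysis would pull them from product analytics:

```python
# Cohort analysis sketch: group users by signup month, track active months.
import pandas as pd

events = pd.DataFrame({
    "user": ["a", "a", "b", "b", "c", "c", "c"],
    "signup_month": ["2024-01"] * 4 + ["2024-02"] * 3,
    "active_month": ["2024-01", "2024-02", "2024-01", "2024-03",
                     "2024-02", "2024-03", "2024-04"],
})

# Distinct users per cohort active in each calendar month.
cohort = (
    events.groupby(["signup_month", "active_month"])["user"]
    .nunique()
    .unstack(fill_value=0)
)
print(cohort)
```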

Now that we have seen how to interpret data, let's move on and ask ourselves some questions: What are some of the benefits of data interpretation? Why do all industries engage in data research and analysis? These are basic questions, but they often don’t receive adequate attention.

Why Data Interpretation Is Important

The purpose of collection and interpretation is to acquire useful and usable information and to make the most informed decisions possible. From businesses to newlyweds researching their first home, data collection and interpretation provide limitless benefits for a wide range of institutions and individuals.

Data analysis and interpretation, regardless of the method and qualitative/quantitative status, may include the following characteristics:

  • Data identification and explanation
  • Comparing and contrasting data
  • Identification of data outliers
  • Future predictions

Data analysis and interpretation, in the end, help improve processes and identify problems. It is difficult to grow and make dependable improvements without, at the very least, minimal data collection and interpretation. What is the keyword? Dependable. Vague ideas regarding performance enhancement exist within all institutions and industries. Yet, without proper research and analysis, an idea is likely to remain in a stagnant state forever (i.e., minimal growth). So… what are a few of the business benefits of digital age data analysis and interpretation? Let’s take a look!

1) Informed decision-making: A decision is only as good as the knowledge that formed it. Informed data decision-making can potentially set industry leaders apart from the rest of the market pack. Studies have shown that companies in the top third of their industries are, on average, 5% more productive and 6% more profitable when implementing informed data decision-making processes. Most decisive actions will arise only after a problem has been identified or a goal defined. Data analysis should include identification, thesis development, and data collection, followed by data communication.

If institutions only follow that simple order, one that we should all be familiar with from grade school science fairs, then they will be able to solve issues as they emerge in real-time. Informed decision-making has a tendency to be cyclical. This means there is really no end, and eventually, new questions and conditions arise within the process that need to be studied further. The monitoring of data results will inevitably return the process to the start with new data and insights.

2) Anticipating needs with trends identification: data insights provide knowledge, and knowledge is power. The insights obtained from market and consumer data analyses have the ability to set trends for peers within similar market segments. A perfect example of how data analytics can impact trend prediction is evidenced in the music identification application Shazam. The application allows users to upload an audio clip of a song they like but can’t seem to identify. Users make 15 million song identifications a day. With this data, Shazam has been instrumental in predicting future popular artists.

When industry trends are identified, they can then serve a greater industry purpose. For example, the insights from Shazam’s monitoring benefit not only Shazam’s understanding of how to meet consumer needs but also grant music executives and record label companies insight into the pop-culture scene of the day. Data gathering and interpretation processes can allow for industry-wide climate prediction and result in greater revenue streams across the market. For this reason, all institutions should follow the basic data cycle of collection, interpretation, decision-making, and monitoring.

3) Cost efficiency: Proper implementation of analytics processes can provide businesses with profound cost advantages within their industries. A recent data study performed by Deloitte vividly demonstrates this in finding that data analysis ROI is driven by efficient cost reductions. Often, this benefit is overlooked because making money is typically viewed as “sexier” than saving money. Yet, sound data analyses have the ability to alert management to cost-reduction opportunities without any significant exertion of effort on the part of human capital.

A great example of the potential for cost efficiency through data analysis is Intel. Prior to 2012, Intel would conduct over 19,000 manufacturing function tests on their chips before they could be deemed acceptable for release. To cut costs and reduce test time, Intel implemented predictive data analyses. By using historical and current data, Intel now avoids testing each chip 19,000 times by focusing on specific and individual chip tests. After its implementation in 2012, Intel saved over $3 million in manufacturing costs. Cost reduction may not be as “sexy” as data profit, but as Intel proves, it is a benefit of data analysis that should not be neglected.

4) Clear foresight: companies that collect and analyze their data gain better knowledge about themselves, their processes, and their performance. They can identify performance challenges when they arise and take action to overcome them. Data interpretation through visual representations lets them process their findings faster and make better-informed decisions on the company's future.

Key Data Interpretation Skills You Should Have

Just like any other process, data interpretation and analysis require researchers or analysts to have some key skills to be able to perform successfully. It is not enough just to apply some methods and tools to the data; the person who is managing it needs to be objective and have a data-driven mind, among other skills. 

It is a common misconception to think that the required skills are mostly number-related. While data interpretation is heavily analytically driven, it also requires communication and narrative skills, as the results of the analysis need to be presented in a way that is easy to understand for all types of audiences. 

Luckily, with the rise of self-service tools and AI-driven technologies, data interpretation is no longer reserved for analysts alone. However, the topic remains a big challenge for businesses that make big investments in data and the tools to support it, as the required interpretation skills are still lacking. It is worthless to pour massive amounts of money into extracting information if you are not able to interpret what that information is telling you. For that reason, below we list the top five data interpretation skills your employees or researchers should have to extract the maximum potential from the data.

  • Data Literacy: The first and most important skill to have is data literacy. This means having the ability to understand, work, and communicate with data. It involves knowing the types of data sources and methods, and the ethical implications of using them. In research, this skill is often a given. However, in a business context, there might be many employees who are not comfortable with data. The issue is that the interpretation of data cannot be solely the data team’s responsibility, as that is not sustainable in the long run. Experts advise business leaders to carefully assess the literacy level across their workforce and implement training instances to ensure everyone can interpret their data.
  • Data Tools: The data interpretation and analysis process involves using various tools to collect, clean, store, and analyze the data. The complexity of the tools varies depending on the type of data and the analysis goals. Going from simple ones like Excel to more complex ones like databases, such as SQL, or programming languages, such as R or Python. It also involves visual analytics tools to bring the data to life through the use of graphs and charts. Managing these tools is a fundamental skill as they make the process faster and more efficient. As mentioned before, most modern solutions are now self-service, enabling less technical users to use them without problem.
  • Critical Thinking: Another very important skill is to have critical thinking. Data hides a range of conclusions, trends, and patterns that must be discovered. It is not just about comparing numbers; it is about putting a story together based on multiple factors that will lead to a conclusion. Therefore, having the ability to look further from what is right in front of you is an invaluable skill for data interpretation. 
  • Data Ethics: In the information age, being aware of the legal and ethical responsibilities that come with the use of data is of utmost importance. In short, data ethics involves respecting the privacy and confidentiality of data subjects, as well as ensuring accuracy and transparency in data usage. It requires the analyzer or researcher to be completely objective with their interpretation to avoid any biases or discrimination. Many countries have already implemented regulations regarding the use of data, including the GDPR and the ACM Code of Ethics. Awareness of these regulations and responsibilities is a fundamental skill that anyone working in data interpretation should have.
  • Domain Knowledge: Another skill that is considered important when interpreting data is to have domain knowledge. As mentioned before, data hides valuable insights that need to be uncovered. To do so, the analyst needs to know about the industry or domain from which the information is coming and use that knowledge to explore it and put it into a broader context. This is especially valuable in a business context, where most departments are now analyzing data independently with the help of a live dashboard instead of relying on the IT department, which can often overlook some aspects due to a lack of expertise in the topic. 

Common Data Analysis And Interpretation Problems

The oft-repeated mantra of those who fear data advancements in the digital age is “big data equals big trouble.” While that statement is not accurate, it is safe to say that certain data interpretation problems or “pitfalls” exist and can occur when analyzing data, especially at the speed of thought. Let’s identify some of the most common data misinterpretation risks and shed some light on how they can be avoided:

1) Correlation mistaken for causation: our first misinterpretation of data refers to the tendency of data analysts to confuse the cause of a phenomenon with correlation. It is the assumption that because two actions occurred together, one caused the other. This is inaccurate, as actions can occur together without a cause-and-effect relationship.

  • Digital age example: assuming that increased revenue results from increased social media followers… there might be a definitive correlation between the two, especially with today’s multi-channel purchasing experiences. But that does not mean an increase in followers is the direct cause of increased revenue. There could be both a common cause and an indirect causality.
  • Remedy: attempt to eliminate the variable you believe to be causing the phenomenon.

2) Confirmation bias: our second problem is data interpretation bias. It occurs when you have a theory or hypothesis in mind but are intent on only discovering data patterns that support it while rejecting those that do not.

  • Digital age example: your boss asks you to analyze the success of a recent multi-platform social media marketing campaign. While analyzing the potential data variables from the campaign (one that you ran and believe performed well), you see that the share rate for Facebook posts was great, while the share rate for Twitter Tweets was not. Using only Facebook posts to prove your hypothesis that the campaign was successful would be a perfect manifestation of confirmation bias.
  • Remedy: as this pitfall is often based on subjective desires, one remedy would be to analyze data with a team of objective individuals. If this is not possible, another solution is to resist the urge to make a conclusion before data exploration has been completed. Remember to always try to disprove a hypothesis, not prove it.

3) Irrelevant data: the third data misinterpretation pitfall is especially important in the digital age. As large data is no longer centrally stored and as it continues to be analyzed at the speed of thought, it is inevitable that analysts will focus on data that is irrelevant to the problem they are trying to correct.

  • Digital age example: in attempting to gauge the success of an email lead generation campaign, you notice that the number of homepage views directly resulting from the campaign increased, but the number of monthly newsletter subscribers did not. Based on the number of homepage views, you decide the campaign was a success when really it generated zero leads.
  • Remedy: proactively and clearly frame any data analysis variables and KPIs prior to engaging in a data review. If the metric you use to measure the success of a lead generation campaign is newsletter subscribers, there is no need to review the number of homepage visits. Be sure to focus on the data variable that answers your question or solves your problem and not on irrelevant data.

4) Truncating an axis: When creating a graph to start interpreting the results of your analysis, it is important to keep the axes truthful and avoid generating misleading visualizations. Starting an axis at a value that doesn’t portray the actual truth about the data can lead to false conclusions.

  • Digital age example: In the image below, we can see a graph from Fox News in which the y-axis starts at 34%, making it seem that the difference between 35% and 39.6% is far larger than it actually is. This could lead to a misinterpretation of the tax rate changes.

Fox News graph with a truncated Y-axis

*Source: www.venngage.com*

  • Remedy: Be careful with how your data is visualized. Be respectful and realistic with axes to avoid misinterpretation of your data. See below how the Fox News chart looks when using the correct axis values. This chart was created with datapine's modern online data visualization tool.

Fox News graph with the correct axis values
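
To see the effect in code, here is a minimal matplotlib sketch (the two values are approximated from the example above) that draws the same data once with a truncated axis and once with an honest one:

```python
# A minimal sketch contrasting a truncated axis with an honest one.
# The two rates are approximated from the Fox News example above.
import matplotlib.pyplot as plt

labels = ["Now", "Jan 1, 2013"]
rates = [35.0, 39.6]  # top tax rate, in percent

fig, (ax_bad, ax_good) = plt.subplots(1, 2, figsize=(8, 3))

ax_bad.bar(labels, rates)
ax_bad.set_ylim(34, 42)   # truncated axis exaggerates the gap
ax_bad.set_title("Misleading: axis starts at 34%")

ax_good.bar(labels, rates)
ax_good.set_ylim(0, 45)   # starting at zero keeps proportions honest
ax_good.set_title("Honest: axis starts at 0%")

plt.tight_layout()
plt.show()
```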

5) (Small) sample size: Another common problem is using a small sample size. Logically, the bigger the sample size, the more accurate and reliable the results. However, this also depends on the effect size being studied. For example, the sample size needed for a survey about the quality of education will not be the same as for one about people doing outdoor sports in a specific area.

  • Digital age example: Imagine you ask 20 people a question, and 19 answer “yes,” resulting in 95% of the total. Now imagine you ask the same question to 1,000 people, and 950 of them answer “yes,” which is again 95%. While these percentages might look the same, they certainly do not mean the same thing, as a 20-person sample is too small to support a reliable conclusion.
  • Remedy: Researchers say that in order to determine the correct sample size for truthful and meaningful results, it is necessary to define a margin of error that represents the maximum amount they are willing to let the results deviate from the statistical mean. Paired with this, they need to define a confidence level, typically between 90% and 99%. With these two values in hand, researchers can calculate an accurate sample size for their studies, as the short sketch below illustrates.
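
Here is a minimal Python sketch of that calculation, using the standard sample-size formula for estimating a proportion; the default values (95% confidence, 5% margin of error, and p = 0.5 as the most conservative assumption) are purely illustrative:

```python
# A minimal sketch of the standard sample-size formula for a proportion:
# n = z^2 * p * (1 - p) / e^2, where z comes from the chosen confidence
# level, p is the expected proportion (0.5 is the most conservative
# choice), and e is the margin of error.
import math
from scipy.stats import norm

def sample_size(confidence=0.95, margin_of_error=0.05, p=0.5):
    z = norm.ppf(1 - (1 - confidence) / 2)  # two-sided z-score
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

print(sample_size())                 # 385 respondents at 95% confidence
print(sample_size(confidence=0.99))  # 664 respondents at 99% confidence
```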

6) Reliability, subjectivity, and generalizability: When performing qualitative analysis, researchers must consider practical and theoretical limitations when interpreting the data. In some cases, this type of research can be considered unreliable because of uncontrolled factors that might or might not affect the results. This is paired with the fact that the researcher has a primary role in the interpretation process, meaning he or she decides what is relevant and what is not, and as we know, interpretations can be very subjective.

Generalizability is also an issue that researchers face when dealing with qualitative analysis. As mentioned in the point about having a small sample size, it is difficult to draw conclusions that are 100% representative because the results might be biased or unrepresentative of a wider population. 

While these factors are mostly present in qualitative research, they can also affect the quantitative analysis. For example, when choosing which KPIs to portray and how to portray them, analysts can also be biased and represent them in a way that benefits their analysis.

  • Digital age example: Biased questions in a survey are a great example of reliability and subjectivity issues. Imagine you are sending a survey to your clients to see how satisfied they are with your customer service, with this question: “How amazing was your experience with our customer service team?”. Here, we can see that the question clearly influences the response of the individual by including the word “amazing” in it.
  • Remedy: A solution to avoid these issues is to keep your research honest and neutral. Keep the wording of the questions as objective as possible. For example: “On a scale of 1-10, how satisfied were you with our customer service team?”. This does not lead the respondent to any specific answer, meaning the results of your survey will be reliable. 

Data Interpretation Best Practices & Tips

Data interpretation methods and techniques by datapine

Data analysis and interpretation are critical to developing sound conclusions and making better-informed decisions. As we have seen with this article, there is an art and science to the interpretation of data. To help you with this purpose, we will list a few relevant techniques, methods, and tricks you can implement for a successful data management process. 

As mentioned at the beginning of this post, the first step to interpreting data in a successful way is to identify the type of analysis you will perform and apply the methods accordingly. Clearly differentiate between qualitative analysis (observing, documenting, and interviewing, then collecting and reflecting on the non-numerical material) and quantitative analysis (research based on large amounts of numerical data to be analyzed through various statistical methods).

1) Ask the right data interpretation questions

The first data interpretation technique is to define a clear baseline for your work. This can be done by answering some critical questions that will serve as a useful guideline to start. Some of them include: what are the goals and objectives of my analysis? What type of data interpretation method will I use? Who will use this data in the future? And most importantly, what general question am I trying to answer?

Once all this information has been defined, you will be ready for the next step: collecting your data. 

2) Collect and assimilate your data

Now that a clear baseline has been established, it is time to collect the information you will use. Always remember that your methods for data collection will vary depending on the type of analysis you chose, which can be qualitative or quantitative. Relying on professional online data analysis tools to facilitate this process is a great practice, as manually collecting and assessing raw data is not only very time-consuming and expensive but also prone to errors and subjectivity.

Once your data is collected, you need to carefully assess it to understand if its quality is appropriate to be used during a study. That means asking: Is the sample size big enough? Were the procedures used to collect the data implemented correctly? Is the date range of the data correct? If it comes from an external source, is that source trusted and objective? The short sketch below shows a few of these checks in practice.
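
As a minimal pandas sketch of such quality checks (the dataset and column names are invented purely for illustration):

```python
# A minimal sketch of pre-interpretation quality checks. The dataset and
# the column names "responded_at" and "rating" are invented examples.
import pandas as pd

df = pd.DataFrame({
    "responded_at": pd.to_datetime(["2024-01-03", "2024-01-10", "2024-02-02", None]),
    "rating": [8, 11, 7, 9],  # 11 falls outside the expected 1-10 scale
})

print("sample size:", len(df))
print("missing values per column:", df.isna().sum().to_dict())
print("duplicate rows:", int(df.duplicated().sum()))
print("date range:", df["responded_at"].min(), "to", df["responded_at"].max())
print("ratings within 1-10 scale:", bool(df["rating"].between(1, 10).all()))
```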

With all the needed information in hand, you are ready to start the interpretation process, but first, you need to visualize your data. 

3) Use the right data visualization type 

Data visualizations such as business graphs, charts, and tables are fundamental to successfully interpreting data. This is because data visualization via interactive charts and graphs makes the information more understandable and accessible. As you might be aware, there are different types of visualizations you can use, but not all of them are suitable for every analysis purpose. Using the wrong graph can lead to misinterpretation of your data, so it’s very important to carefully pick the right visual for it. Let’s look at some use cases of common data visualizations; a short code sketch follows the list.

  • Bar chart: One of the most used chart types, the bar chart uses rectangular bars to show the relationship between 2 or more variables. There are different types of bar charts for different interpretations, including the horizontal bar chart, column bar chart, and stacked bar chart. 
  • Line chart: Most commonly used to show trends, acceleration or decelerations, and volatility, the line chart aims to show how data changes over a period of time, for example, sales over a year. A few tips to keep this chart ready for interpretation are not using many variables that can overcrowd the graph and keeping your axis scale close to the highest data point to avoid making the information hard to read. 
  • Pie chart: Although it doesn’t do a lot in terms of analysis due to its simple nature, the pie chart is widely used to show the proportional composition of a variable. Visually speaking, showing a percentage in a pie chart is much more intuitive than showing it in a bar chart. However, this also depends on the number of variables you are comparing. If your pie chart would need to be divided into 10 portions, then it is better to use a bar chart instead.
  • Tables: While they are not a specific type of chart, tables are widely used when interpreting data. Tables are especially useful when you want to portray data in its raw format. They give you the freedom to easily look up or compare individual values while also displaying grand totals. 
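
Below is a minimal matplotlib sketch of these chart choices, using invented sales figures purely for illustration:

```python
# A minimal sketch of the chart choices above, on invented figures.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 128, 150, 162, 158]
channels = ["Online", "Retail", "Partners"]
share = [55, 30, 15]

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))

ax1.bar(channels, share)             # bar: compare values across categories
ax1.set_title("Bar: category comparison")

ax2.plot(months, sales, marker="o")  # line: show a trend over time
ax2.set_title("Line: trend over time")

ax3.pie(share, labels=channels, autopct="%1.0f%%")  # pie: composition, few slices
ax3.set_title("Pie: composition")

plt.tight_layout()
plt.show()
```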

With the use of data visualizations becoming more and more critical for businesses’ analytical success, many tools have emerged to help users visualize their data in a cohesive and interactive way. One of the most popular ones is the use of BI dashboards. These visual tools provide a centralized view of various graphs and charts that paint a bigger picture of a topic. We will discuss the power of dashboards for an efficient data interpretation practice in the next portion of this post. If you want to learn more about different types of graphs and charts, take a look at our complete guide on the topic.

4) Start interpreting 

After the tedious preparation part, you can start extracting conclusions from your data. As mentioned many times throughout the post, the way you decide to interpret the data will depend on the methods you initially chose. If you had initial research questions or hypotheses, then you should look for ways to test their validity. If you are going into the data with no defined hypothesis, then start looking for relationships and patterns that will allow you to extract valuable conclusions from the information.
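
When exploring without a fixed hypothesis, a quick first pass is to rank pairwise correlations and flag the strongest ones for closer inspection. Here is a minimal pandas sketch on simulated campaign metrics (all names and numbers are invented):

```python
# A minimal sketch of undirected pattern hunting: rank pairwise
# correlations to surface relationships worth a closer look.
import numpy as np
import pandas as pd

rng = np.random.default_rng(8)
n = 90  # e.g., 90 days of simulated campaign metrics
visits = rng.normal(1_000, 150, n)
signups = visits * 0.03 + rng.normal(0, 5, n)
ad_spend = rng.normal(500, 80, n)

df = pd.DataFrame({"visits": visits, "signups": signups, "ad_spend": ad_spend})

pairs = df.corr().abs().unstack().sort_values(ascending=False)
pairs = pairs[pairs < 1.0]  # drop self-correlations (each pair appears twice)
print(pairs.head())         # candidate relationships to investigate further
```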

During the process of interpretation, stay curious and creative, dig into the data, and determine if there are any other critical questions that should be asked. If any new questions arise, you need to assess if you have the necessary information to answer them. Being able to identify if you need to dedicate more time and resources to the research is a very important step. No matter if you are studying customer behaviors or a new cancer treatment, the findings from your analysis may dictate important decisions in the future. Therefore, taking the time to really assess the information is key. For that purpose, data interpretation software proves to be very useful.

5) Keep your interpretation objective

As mentioned above, objectivity is one of the most important data interpretation skills but also one of the hardest. Being the person closest to the investigation, it is easy to become subjective when looking for answers in the data. A good way to stay objective is to show the information related to the study to other people, for example, research partners or even the people who will use your findings once they are done. This can help avoid confirmation bias and any reliability issues with your interpretation. 

Remember, using a visualization tool such as a modern dashboard will make the interpretation process way easier and more efficient as the data can be navigated and manipulated in an easy and organized way. And not just that, using a dashboard tool to present your findings to a specific audience will make the information easier to understand and the presentation way more engaging thanks to the visual nature of these tools. 

6) Mark your findings and draw conclusions

Findings are the observations you extracted from your data. They are the facts that will help you drive deeper conclusions about your research. For example, findings can be trends and patterns you found during your interpretation process. To put your findings into perspective, you can compare them with other resources that use similar methods and use them as benchmarks.

Reflect on your own thinking and reasoning and be aware of the many pitfalls data analysis and interpretation carry—correlation versus causation, subjective bias, false information, inaccurate data, etc. Once you are comfortable with interpreting the data, you will be ready to develop conclusions, see if your initial questions were answered, and suggest recommendations based on them.

Interpretation of Data: The Use of Dashboards Bridging The Gap

As we have seen, quantitative and qualitative methods are distinct types of data interpretation and analysis. Both offer a varying degree of return on investment (ROI) regarding data investigation, testing, and decision-making. But how do you mix the two and prevent a data disconnect? The answer is professional data dashboards. 

For a few years now, dashboards have become invaluable tools to visualize and interpret data. These tools offer a centralized and interactive view of data and provide the perfect environment for exploration and extracting valuable conclusions. They bridge the quantitative and qualitative information gap by unifying all the data in one place with the help of stunning visuals. 

Not only that, but these powerful tools offer a large list of benefits, and we will discuss some of them below. 

1) Connecting and blending data. With today’s pace of innovation, it is no longer feasible (nor desirable) to have bulk data centrally located. As businesses continue to globalize and borders continue to dissolve, it will become increasingly important for businesses to possess the capability to run diverse data analyses absent the limitations of location. Data dashboards decentralize data without compromising on the necessary speed of thought while blending both quantitative and qualitative data. Whether you want to measure customer trends or organizational performance, you now have the capability to do both without the need for a singular selection.

2) Mobile Data. Related to the notion of “connected and blended data” is that of mobile data. In today’s digital world, employees are spending less time at their desks and simultaneously increasing production. This is made possible because mobile solutions for analytical tools are no longer standalone. Today, mobile analysis applications seamlessly integrate with everyday business tools. In turn, both quantitative and qualitative data are now available on-demand where they’re needed, when they’re needed, and how they’re needed via interactive online dashboards.

3) Visualization. Data dashboards merge the data gap between qualitative and quantitative data interpretation methods through the science of visualization. Dashboard solutions come “out of the box” and are well-equipped to create easy-to-understand data demonstrations. Modern online data visualization tools provide a variety of color and filter patterns, encourage user interaction, and are engineered to help enhance future trend predictability. All of these visual characteristics make for an easy transition among data methods – you only need to find the right types of data visualization to tell your data story the best way possible.

4) Collaboration. Whether in a business environment or a research project, collaboration is key in data interpretation and analysis. Dashboards are online tools that can be easily shared through a password-protected URL or automated email. Through them, users can collaborate and communicate through the data in an efficient way, eliminating the need for endless file versions with lost updates. Tools such as datapine offer real-time updates, meaning your dashboards will update on their own as soon as new information is available.

Examples Of Data Interpretation In Business

To give you an idea of how a dashboard can fulfill the need to bridge quantitative and qualitative analysis and help in understanding how to interpret data in research thanks to visualization, below, we will discuss three valuable examples to put their value into perspective.

1. Customer Satisfaction Dashboard 

This market research dashboard brings together both qualitative and quantitative data that are knowledgeably analyzed and visualized in a meaningful way that everyone can understand, thus empowering any viewer to interpret it. Let’s explore it below. 

Data interpretation example on customers' satisfaction with a brand


The value of this template lies in its highly visual nature. As mentioned earlier, visuals make the interpretation process easier and more efficient. Having critical pieces of data represented with colorful and interactive icons and graphs makes it possible to uncover insights at a glance. For example, the colors green, yellow, and red on the charts for the NPS and the customer effort score allow us to conclude at a glance that most respondents are satisfied with this brand. The line chart below supports this conclusion, as both metrics developed positively over the past 6 months.

The bottom part of the template provides visually stunning representations of different satisfaction scores for quality, pricing, design, and service. By looking at these, we can conclude that, overall, customers are satisfied with this company in most areas. 

2. Brand Analysis Dashboard

Next, in our list of data interpretation examples, we have a template that shows the answers to a survey on awareness for Brand D. The sample size is listed on top to get a perspective of the data, which is represented using interactive charts and graphs. 

Data interpretation example using a market research dashboard for brand awareness analysis

When interpreting information, context is key to understanding it correctly. For that reason, the dashboard starts by offering insights into the demographics of the surveyed audience. In general, we can see that ages and genders are diverse. Therefore, we can conclude these brands are not targeting customers from a specific demographic, an important aspect for putting the survey answers into perspective.

Looking at the awareness portion, we can see that brand B is the most popular one, with brand D coming second on both questions. This means brand D is not doing badly, but there is still room for improvement compared to brand B. To see where brand D could improve, the researcher could go to the bottom part of the dashboard and consult the answers for branding themes and celebrity analysis. These are important as they give clear insight into which people and messages the audience associates with brand D. This is an opportunity to exploit these topics in different ways and achieve growth and success.

3. Product Innovation Dashboard 

Our third and last dashboard example shows the answers to a survey on product innovation for a technology company. Just like the previous templates, the interactive and visual nature of the dashboard makes it the perfect tool to interpret data efficiently and effectively. 

Market research results on product innovation, useful for product development and pricing decisions as an example of data interpretation using dashboards

Starting from right to left, we first get a list of the top 5 products by purchase intention. This information lets us understand if the product being evaluated resembles what the audience already intends to purchase. It is a great starting point to see how customers would respond to the new product. This information can be complemented with other key metrics displayed in the dashboard. For example, the usage and purchase intention track how the market would receive the product and if they would purchase it, respectively. Interpreting these values as positive or negative will depend on the company and its expectations regarding the survey. 

Complementing these metrics, we have the willingness to pay, arguably one of the most important metrics for defining pricing strategies. Here, we can see that most respondents think the suggested price is good value for money. Therefore, we can interpret that the product would sell at that price.

To see more data analysis and interpretation examples for different industries and functions, visit our library of business dashboards.

To Conclude…

As we reach the end of this insightful post about data interpretation and analysis, we hope you have a clear understanding of the topic. We've covered the definition and given some examples and methods to perform a successful interpretation process.

The importance of data interpretation is undeniable. Dashboards not only bridge the information gap between traditional data interpretation methods and technology, but they can help remedy and prevent the major pitfalls of the process. As a digital age solution, they combine the best of the past and the present to allow for informed decision-making with maximum data interpretation ROI.

To start visualizing your insights in a meaningful and actionable way, test our online reporting software for free with our 14-day trial!

Creating a Data Analysis Plan: What to Consider When Choosing Statistics for a Study

There are three kinds of lies: lies, damned lies, and statistics. – Mark Twain 1

INTRODUCTION

Statistics represent an essential part of a study because, regardless of the study design, investigators need to summarize the collected information for interpretation and presentation to others. It is therefore important for us to heed Mr Twain’s concern when creating the data analysis plan. In fact, even before data collection begins, we need to have a clear analysis plan that will guide us from the initial stages of summarizing and describing the data through to testing our hypotheses.

The purpose of this article is to help you create a data analysis plan for a quantitative study. For those interested in conducting qualitative research, previous articles in this Research Primer series have provided information on the design and analysis of such studies.2,3 Information in the current article is divided into 3 main sections: an overview of terms and concepts used in data analysis, a review of common methods used to summarize study data, and a process to help identify relevant statistical tests. My intention here is to introduce the main elements of data analysis and provide a place for you to start when planning this part of your study. Biostatistical experts, textbooks, statistical software packages, and other resources can certainly add more breadth and depth to this topic when you need additional information and advice.

TERMS AND CONCEPTS USED IN DATA ANALYSIS

When analyzing information from a quantitative study, we are often dealing with numbers; therefore, it is important to begin with an understanding of the source of the numbers. Let us start with the term variable, which defines a specific item of information collected in a study. Examples of variables include age, sex or gender, ethnicity, exercise frequency, weight, treatment group, and blood glucose. Each variable will have a group of categories, which are referred to as values, to help describe the characteristic of an individual study participant. For example, the variable “sex” would have values of “male” and “female”.

Although variables can be defined or grouped in various ways, I will focus on 2 methods at this introductory stage. First, variables can be defined according to the level of measurement. The categories in a nominal variable are names, for example, male and female for the variable “sex”; white, Aboriginal, black, Latin American, South Asian, and East Asian for the variable “ethnicity”; and intervention and control for the variable “treatment group”. Nominal variables with only 2 categories are also referred to as dichotomous variables because the study group can be divided into 2 subgroups based on information in the variable. For example, a study sample can be split into 2 groups (patients receiving the intervention and controls) using the dichotomous variable “treatment group”. An ordinal variable implies that the categories can be placed in a meaningful order, as would be the case for exercise frequency (never, sometimes, often, or always). Nominal-level and ordinal-level variables are also referred to as categorical variables, because each category in the variable can be completely separated from the others. The categories for an interval variable can be placed in a meaningful order, with the interval between consecutive categories also having meaning. Age, weight, and blood glucose can be considered as interval variables, but also as ratio variables, because the ratio between values has meaning (e.g., a 15-year-old is half the age of a 30-year-old). Interval-level and ratio-level variables are also referred to as continuous variables because of the underlying continuity among categories.

As we progress through the levels of measurement from nominal to ratio variables, we gather more information about the study participant. The amount of information that a variable provides will become important in the analysis stage, because we lose information when variables are reduced or aggregated—a common practice that is not recommended.4 For example, if age is reduced from a ratio-level variable (measured in years) to an ordinal variable (categories of < 65 and ≥ 65 years) we lose the ability to make comparisons across the entire age range and introduce error into the data analysis.4

A second method of defining variables is to consider them as either dependent or independent. As the terms imply, the value of a dependent variable depends on the value of other variables, whereas the value of an independent variable does not rely on other variables. In addition, an investigator can influence the value of an independent variable, such as treatment-group assignment. Independent variables are also referred to as predictors because we can use information from these variables to predict the value of a dependent variable. Building on the group of variables listed in the first paragraph of this section, blood glucose could be considered a dependent variable, because its value may depend on values of the independent variables age, sex, ethnicity, exercise frequency, weight, and treatment group.

Statistics are mathematical formulae that are used to organize and interpret the information that is collected through variables. There are 2 general categories of statistics, descriptive and inferential. Descriptive statistics are used to describe the collected information, such as the range of values, their average, and the most common category. Knowledge gained from descriptive statistics helps investigators learn more about the study sample. Inferential statistics are used to make comparisons and draw conclusions from the study data. Knowledge gained from inferential statistics allows investigators to make inferences and generalize beyond their study sample to other groups.

Before we move on to specific descriptive and inferential statistics, there are 2 more definitions to review. Parametric statistics are generally used when values in an interval-level or ratio-level variable are normally distributed (i.e., the entire group of values has a bell-shaped curve when plotted by frequency). These statistics are used because we can define parameters of the data, such as the centre and width of the normally distributed curve. In contrast, interval-level and ratio-level variables with values that are not normally distributed, as well as nominal-level and ordinal-level variables, are generally analyzed using nonparametric statistics.

METHODS FOR SUMMARIZING STUDY DATA: DESCRIPTIVE STATISTICS

The first step in a data analysis plan is to describe the data collected in the study. This can be done using figures to give a visual presentation of the data and statistics to generate numeric descriptions of the data.

Selection of an appropriate figure to represent a particular set of data depends on the measurement level of the variable. Data for nominal-level and ordinal-level variables may be interpreted using a pie graph or bar graph. Both options allow us to examine the relative number of participants within each category (by reporting the percentages within each category), whereas a bar graph can also be used to examine absolute numbers. For example, we could create a pie graph to illustrate the proportions of men and women in a study sample and a bar graph to illustrate the number of people who report exercising at each level of frequency (never, sometimes, often, or always).

Interval-level and ratio-level variables may also be interpreted using a pie graph or bar graph; however, these types of variables often have too many categories for such graphs to provide meaningful information. Instead, these variables may be better interpreted using a histogram. Unlike a bar graph, which displays the frequency for each distinct category, a histogram displays the frequency within a range of continuous categories. Information from this type of figure allows us to determine whether the data are normally distributed. In addition to pie graphs, bar graphs, and histograms, many other types of figures are available for the visual representation of data. Interested readers can find additional types of figures in the books recommended in the “Further Reading” section.

Figures are also useful for visualizing comparisons between variables or between subgroups within a variable (for example, the distribution of blood glucose according to sex). Box plots are useful for summarizing information for a variable that does not follow a normal distribution. The lower and upper limits of the box identify the interquartile range (or 25th and 75th percentiles), while the midline indicates the median value (or 50th percentile). Scatter plots provide information on how the categories for one continuous variable relate to categories in a second variable; they are often helpful in the analysis of correlations.
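
As a minimal illustration of these figure choices, the matplotlib sketch below (on simulated, deliberately skewed values) draws a histogram to inspect the shape of a distribution and a box plot to summarize its median and interquartile range:

```python
# A minimal sketch: a histogram to inspect a distribution's shape and a
# box plot to summarize a skewed variable. The data are simulated.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
blood_glucose = rng.lognormal(mean=1.8, sigma=0.3, size=200)  # skewed values

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.hist(blood_glucose, bins=20)  # is the shape bell-like or skewed?
ax1.set_title("Histogram")

ax2.boxplot(blood_glucose)        # midline = median; box = interquartile range
ax2.set_title("Box plot")

plt.tight_layout()
plt.show()
```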

In addition to using figures to present a visual description of the data, investigators can use statistics to provide a numeric description. Regardless of the measurement level, we can find the mode by identifying the most frequent category within a variable. When summarizing nominal-level and ordinal-level variables, the simplest method is to report the proportion of participants within each category.

The choice of the most appropriate descriptive statistic for interval-level and ratio-level variables will depend on how the values are distributed. If the values are normally distributed, we can summarize the information using the parametric statistics of mean and standard deviation. The mean is the arithmetic average of all values within the variable, and the standard deviation tells us how widely the values are dispersed around the mean. When values of interval-level and ratio-level variables are not normally distributed, or we are summarizing information from an ordinal-level variable, it may be more appropriate to use the nonparametric statistics of median and range. The first step in identifying these descriptive statistics is to arrange study participants according to the variable categories from lowest value to highest value. The range is used to report the lowest and highest values. The median or 50th percentile is located by dividing the number of participants into 2 groups, such that half (50%) of the participants have values above the median and the other half (50%) have values below the median. Similarly, the 25th percentile is the value with 25% of the participants having values below and 75% of the participants having values above, and the 75th percentile is the value with 75% of participants having values below and 25% of participants having values above. Together, the 25th and 75th percentiles define the interquartile range.
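
As a minimal numeric companion to these definitions, here is a short numpy sketch on simulated ages (all values invented):

```python
# A minimal sketch of the descriptive statistics defined above,
# computed on a simulated interval-level variable.
import numpy as np

rng = np.random.default_rng(7)
age = rng.normal(45, 12, 100)  # hypothetical ages

# Parametric summary (appropriate when values are normally distributed)
print("mean:", age.mean())
print("standard deviation:", age.std(ddof=1))

# Nonparametric summary (appropriate for skewed or ordinal data)
print("median:", np.median(age))
print("range:", age.min(), "to", age.max())
q25, q75 = np.percentile(age, [25, 75])
print("interquartile range:", q25, "to", q75)
```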

PROCESS TO IDENTIFY RELEVANT STATISTICAL TESTS: INFERENTIAL STATISTICS

One caveat about the information provided in this section: selecting the most appropriate inferential statistic for a specific study should be a combination of following these suggestions, seeking advice from experts, and discussing with your co-investigators. My intention here is to give you a place to start a conversation with your colleagues about the options available as you develop your data analysis plan.

There are 3 key questions to consider when selecting an appropriate inferential statistic for a study: What is the research question? What is the study design? and What is the level of measurement? It is important for investigators to carefully consider these questions when developing the study protocol and creating the analysis plan. The figures that accompany these questions show decision trees that will help you to narrow down the list of inferential statistics that would be relevant to a particular study. Appendix 1 provides brief definitions of the inferential statistics named in these figures. Additional information, such as the formulae for various inferential statistics, can be obtained from textbooks, statistical software packages, and biostatisticians.

What Is the Research Question?

The first step in identifying relevant inferential statistics for a study is to consider the type of research question being asked. You can find more details about the different types of research questions in a previous article in this Research Primer series that covered questions and hypotheses.5 A relational question seeks information about the relationship among variables; in this situation, investigators will be interested in determining whether there is an association (Figure 1). A causal question seeks information about the effect of an intervention on an outcome; in this situation, the investigator will be interested in determining whether there is a difference (Figure 2).

Figure 1. Decision tree to identify inferential statistics for an association.

Figure 2. Decision tree to identify inferential statistics for measuring a difference.

What Is the Study Design?

When considering a question of association, investigators will be interested in measuring the relationship between variables (Figure 1). A study designed to determine whether there is consensus among different raters will be measuring agreement. For example, an investigator may be interested in determining whether 2 raters, using the same assessment tool, arrive at the same score. Correlation analyses examine the strength of a relationship or connection between 2 variables, like age and blood glucose. Regression analyses also examine the strength of a relationship or connection; however, in this type of analysis, one variable is considered an outcome (or dependent variable) and the other variable is considered a predictor (or independent variable). Regression analyses often consider the influence of multiple predictors on an outcome at the same time. For example, an investigator may be interested in examining the association between a treatment and blood glucose, while also considering other factors, like age, sex, ethnicity, exercise frequency, and weight.
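
As a rough illustration of the contrast between correlation and regression, here is a minimal numpy sketch on simulated data (the variables mirror the article's example, but every number and effect size is invented); the regression estimates the treatment effect on blood glucose while adjusting for age and weight:

```python
# A minimal sketch contrasting correlation with multivariable regression.
import numpy as np

rng = np.random.default_rng(5)
n = 200
age = rng.normal(50, 10, n)
weight = rng.normal(80, 12, n)
treatment = rng.integers(0, 2, n)  # 0 = control, 1 = intervention

# Simulated outcome: glucose depends on age, weight, and treatment.
glucose = 3 + 0.03 * age + 0.02 * weight - 0.6 * treatment + rng.normal(0, 0.5, n)

# Correlation: strength of the age-glucose relationship alone.
print("r(age, glucose):", np.corrcoef(age, glucose)[0, 1])

# Regression: treatment effect on glucose, adjusted for age and weight.
X = np.column_stack([np.ones(n), age, weight, treatment])
coef, *_ = np.linalg.lstsq(X, glucose, rcond=None)
print("adjusted treatment effect:", coef[3])  # close to the simulated -0.6
```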

When considering a question of difference, investigators must first determine how many groups they will be comparing. In some cases, investigators may be interested in comparing the characteristic of one group with that of an external reference group. For example, is the mean age of study participants similar to the mean age of all people in the target group? If more than one group is involved, then investigators must also determine whether there is an underlying connection between the sets of values (or samples) to be compared. Samples are considered independent or unpaired when the information is taken from different groups. For example, we could use an unpaired t test to compare the mean age between 2 independent samples, such as the intervention and control groups in a study. Samples are considered related or paired if the information is taken from the same group of people, for example, measurement of blood glucose at the beginning and end of a study. Because blood glucose is measured in the same people at both time points, we could use a paired t test to determine whether there has been a significant change in blood glucose.
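
The scipy sketch below is a minimal illustration of these two designs on simulated blood glucose values (all numbers invented):

```python
# A minimal sketch of unpaired vs paired comparisons on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Unpaired: two independent groups (intervention vs control).
control = rng.normal(5.6, 0.8, 40)
intervention = rng.normal(5.1, 0.8, 40)
t, p = stats.ttest_ind(intervention, control)
print(f"unpaired t test: t={t:.2f}, p={p:.3f}")

# Paired: the same participants measured at baseline and study end.
baseline = rng.normal(5.6, 0.8, 40)
end_of_study = baseline - rng.normal(0.4, 0.3, 40)  # simulated improvement
t, p = stats.ttest_rel(baseline, end_of_study)
print(f"paired t test: t={t:.2f}, p={p:.3f}")
```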

What Is the Level of Measurement?

As described in the first section of this article, variables can be grouped according to the level of measurement (nominal, ordinal, or interval). In most cases, the independent variable in an inferential statistic will be nominal; therefore, investigators need to know the level of measurement for the dependent variable before they can select the relevant inferential statistic. Two exceptions to this consideration are correlation analyses and regression analyses (Figure 1). Because a correlation analysis measures the strength of association between 2 variables, we need to consider the level of measurement for both variables. Regression analyses can consider multiple independent variables, often with a variety of measurement levels. However, for these analyses, investigators still need to consider the level of measurement for the dependent variable.

Selection of inferential statistics to test interval-level variables must include consideration of how the data are distributed. An underlying assumption for parametric tests is that the data approximate a normal distribution. When the data are not normally distributed, information derived from a parametric test may be wrong.6 When the assumption of normality is violated (for example, when the data are skewed), then investigators should use a nonparametric test. If the data are normally distributed, then investigators can use a parametric test.
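
This decision can be scripted; the sketch below is a minimal illustration (on simulated, deliberately skewed data) that checks normality with the Shapiro-Wilk test and then picks a parametric or nonparametric comparison accordingly:

```python
# A minimal sketch of the normality-based decision described above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group_a = rng.lognormal(0.0, 0.5, 50)  # simulated, deliberately skewed
group_b = rng.lognormal(0.2, 0.5, 50)

# Shapiro-Wilk tests the null hypothesis that the values are normal.
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    result = stats.ttest_ind(group_a, group_b)     # parametric
else:
    result = stats.mannwhitneyu(group_a, group_b)  # nonparametric
print(result)
```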

ADDITIONAL CONSIDERATIONS

What Is the Level of Significance?

An inferential statistic is used to calculate a p value, the probability of obtaining the observed data (or data more extreme) by chance alone, assuming there is no true effect. Investigators can then compare this p value against a prespecified level of significance, which is often chosen to be 0.05. This level of significance represents a 1 in 20 chance of concluding that an effect exists when it does not, which is considered an acceptable level of error.

What Are the Most Commonly Used Statistics?

In 1983, Emerson and Colditz7 reported the first review of statistics used in original research articles published in the New England Journal of Medicine. This review of statistics used in the journal was updated in 1989 and 2005,8 and this type of analysis has been replicated in many other journals.9–13 Collectively, these reviews have identified 2 important observations. First, the overall sophistication of statistical methodology used and reported in studies has grown over time, with survival analyses and multivariable regression analyses becoming much more common. The second observation is that, despite this trend, 1 in 4 articles describe no statistical methods or report only simple descriptive statistics. When inferential statistics are used, the most common are t tests, contingency table tests (for example, the χ2 test and Fisher exact test), and simple correlation and regression analyses. This information is important for educators, investigators, reviewers, and readers because it suggests that a good foundational knowledge of descriptive statistics and common inferential statistics will enable us to correctly evaluate the majority of research articles.11–13 However, to fully take advantage of all research published in high-impact journals, we need to become acquainted with some of the more complex methods, such as multivariable regression analyses.8,13

What Are Some Additional Resources?

As an investigator and Associate Editor with CJHP , I have often relied on the advice of colleagues to help create my own analysis plans and review the plans of others. Biostatisticians have a wealth of knowledge in the field of statistical analysis and can provide advice on the correct selection, application, and interpretation of these methods. Colleagues who have “been there and done that” with their own data analysis plans are also valuable sources of information. Identify these individuals and consult with them early and often as you develop your analysis plan.

Another important resource to consider when creating your analysis plan is textbooks. Numerous statistical textbooks are available, differing in levels of complexity and scope. The titles listed in the “Further Reading” section are just a few suggestions. I encourage interested readers to look through these and other books to find resources that best fit their needs. However, one crucial book that I highly recommend to anyone wanting to be an investigator or peer reviewer is Lang and Secic’s How to Report Statistics in Medicine (see “Further Reading”). As the title implies, this book covers a wide range of statistics used in medical research and provides numerous examples of how to correctly report the results.

CONCLUSIONS

When it comes to creating an analysis plan for your project, I recommend following the sage advice of Douglas Adams in The Hitchhiker’s Guide to the Galaxy: Don’t panic!14 Begin with simple methods to summarize and visualize your data, then use the key questions and decision trees provided in this article to identify relevant statistical tests. Information in this article will give you and your co-investigators a place to start discussing the elements necessary for developing an analysis plan. But do not stop there! Use advice from biostatisticians and more experienced colleagues, as well as information in textbooks, to help create your analysis plan and choose the most appropriate statistics for your study. Making careful, informed decisions about the statistics to use in your study should reduce the risk of confirming Mr Twain’s concern.

Appendix 1. Glossary of statistical terms (part 1 of 2)

  • 1-way ANOVA: Uses 1 variable to define the groups for comparing means. This is similar to the Student t test when comparing the means of 2 groups.
  • Kruskal–Wallis 1-way ANOVA: Nonparametric alternative for the 1-way ANOVA. Used to determine the difference in medians between 3 or more groups.
  • n -way ANOVA: Uses 2 or more variables to define groups when comparing means. Also called a “between-subjects factorial ANOVA”.
  • Repeated-measures ANOVA: A method for analyzing whether the means of 3 or more measures from the same group of participants are different.
  • Friedman ANOVA: Nonparametric alternative for the repeated-measures ANOVA. It is often used to compare rankings and preferences that are measured 3 or more times.
  • Fisher exact: Variation of chi-square that accounts for cell counts < 5.
  • McNemar: Variation of chi-square that tests statistical significance of changes in 2 paired measurements of dichotomous variables.
  • Cochran Q: An extension of the McNemar test that provides a method for testing for differences between 3 or more matched sets of frequencies or proportions. Often used as a measure of heterogeneity in meta-analyses.
  • 1-sample t test: Used to determine whether the mean of a sample is significantly different from a known or hypothesized value.
  • Independent-samples t test (also referred to as the Student t test): Used when the independent variable is a nominal-level variable that identifies 2 groups and the dependent variable is an interval-level variable.
  • Paired t test: Used to compare 2 paired sets of scores from the same participants (e.g., baseline and follow-up blood pressure within the intervention and control groups).


This article is the 12th in the CJHP Research Primer Series, an initiative of the CJHP Editorial Board and the CSHP Research Committee. The planned 2-year series is intended to appeal to relatively inexperienced researchers, with the goal of building research capacity among practising pharmacists. The articles, presenting simple but rigorous guidance to encourage and support novice researchers, are being solicited from authors with appropriate expertise.

Previous articles in this series:

  • Bond CM. The research jigsaw: how to get started. Can J Hosp Pharm. 2014;67(1):28–30.
  • Tully MP. Research: articulating questions, generating hypotheses, and choosing study designs. Can J Hosp Pharm. 2014;67(1):31–4.
  • Loewen P. Ethical issues in pharmacy practice research: an introductory guide. Can J Hosp Pharm. 2014;67(2):133–7.
  • Tsuyuki RT. Designing pharmacy practice research trials. Can J Hosp Pharm. 2014;67(3):226–9.
  • Bresee LC. An introduction to developing surveys for pharmacy practice research. Can J Hosp Pharm. 2014;67(4):286–91.
  • Gamble JM. An introduction to the fundamentals of cohort and case–control studies. Can J Hosp Pharm. 2014;67(5):366–72.
  • Austin Z, Sutton J. Qualitative research: getting started. Can J Hosp Pharm. 2014;67(6):436–40.
  • Houle S. An introduction to the fundamentals of randomized controlled trials in pharmacy research. Can J Hosp Pharm. 2015;68(1):28–32.
  • Charrois TL. Systematic reviews: What do you need to know to get started? Can J Hosp Pharm. 2015;68(2):144–8.
  • Sutton J, Austin Z. Qualitative research: data collection, analysis, and management. Can J Hosp Pharm. 2015;68(3):226–31.
  • Cadarette SM, Wong L. An introduction to health care administrative data. Can J Hosp Pharm. 2015;68(3):232–7.

Competing interests: None declared.

Further Reading

  • Devor J, Peck R. Statistics: the exploration and analysis of data. 7th ed. Boston (MA): Brooks/Cole Cengage Learning; 2012.
  • Lang TA, Secic M. How to report statistics in medicine: annotated guidelines for authors, editors, and reviewers. 2nd ed. Philadelphia (PA): American College of Physicians; 2006.
  • Mendenhall W, Beaver RJ, Beaver BM. Introduction to probability and statistics. 13th ed. Belmont (CA): Brooks/Cole Cengage Learning; 2009.
  • Norman GR, Streiner DL. PDQ statistics. 3rd ed. Hamilton (ON): B.C. Decker; 2003.
  • Plichta SB, Kelvin E. Munro’s statistical methods for health care research. 6th ed. Philadelphia (PA): Wolters Kluwer Health/Lippincott, Williams & Wilkins; 2013.
  • Numerical and Computational Mathematics
  • Probability and Statistics
  • Pure Mathematics
  • Browse content in Neuroscience
  • Cognition and Behavioural Neuroscience
  • Development of the Nervous System
  • Disorders of the Nervous System
  • History of Neuroscience
  • Invertebrate Neurobiology
  • Molecular and Cellular Systems
  • Neuroendocrinology and Autonomic Nervous System
  • Neuroscientific Techniques
  • Sensory and Motor Systems
  • Browse content in Physics
  • Astronomy and Astrophysics
  • Atomic, Molecular, and Optical Physics
  • Biological and Medical Physics
  • Classical Mechanics
  • Computational Physics
  • Condensed Matter Physics
  • Electromagnetism, Optics, and Acoustics
  • History of Physics
  • Mathematical and Statistical Physics
  • Measurement Science
  • Nuclear Physics
  • Particles and Fields
  • Plasma Physics
  • Quantum Physics
  • Relativity and Gravitation
  • Semiconductor and Mesoscopic Physics
  • Browse content in Psychology
  • Affective Sciences
  • Clinical Psychology
  • Cognitive Psychology
  • Cognitive Neuroscience
  • Criminal and Forensic Psychology
  • Developmental Psychology
  • Educational Psychology
  • Evolutionary Psychology
  • Health Psychology
  • History and Systems in Psychology
  • Music Psychology
  • Neuropsychology
  • Organizational Psychology
  • Psychological Assessment and Testing
  • Psychology of Human-Technology Interaction
  • Psychology Professional Development and Training
  • Research Methods in Psychology
  • Social Psychology
  • Browse content in Social Sciences
  • Browse content in Anthropology
  • Anthropology of Religion
  • Human Evolution
  • Medical Anthropology
  • Physical Anthropology
  • Regional Anthropology
  • Social and Cultural Anthropology
  • Theory and Practice of Anthropology
  • Browse content in Business and Management
  • Business Ethics
  • Business Strategy
  • Business History
  • Business and Technology
  • Business and Government
  • Business and the Environment
  • Comparative Management
  • Corporate Governance
  • Corporate Social Responsibility
  • Entrepreneurship
  • Health Management
  • Human Resource Management
  • Industrial and Employment Relations
  • Industry Studies
  • Information and Communication Technologies
  • International Business
  • Knowledge Management
  • Management and Management Techniques
  • Operations Management
  • Organizational Theory and Behaviour
  • Pensions and Pension Management
  • Public and Nonprofit Management
  • Strategic Management
  • Supply Chain Management
  • Browse content in Criminology and Criminal Justice
  • Criminal Justice
  • Criminology
  • Forms of Crime
  • International and Comparative Criminology
  • Youth Violence and Juvenile Justice
  • Development Studies
  • Browse content in Economics
  • Agricultural, Environmental, and Natural Resource Economics
  • Asian Economics
  • Behavioural Finance
  • Behavioural Economics and Neuroeconomics
  • Econometrics and Mathematical Economics
  • Economic History
  • Economic Systems
  • Economic Methodology
  • Economic Development and Growth
  • Financial Markets
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • Ethnic Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Theory
  • Politics and Law
  • Politics of Development
  • Public Policy
  • Public Administration
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

The Oxford Handbook of Qualitative Research


30 Interpretation Strategies: Appropriate Concepts

Allen Trent, College of Education, University of Wyoming

Jeasik Cho, Department of Educational Studies, University of Wyoming

  • Published: 04 August 2014

This essay addresses a wide range of concepts related to interpretation in qualitative research, examines the meaning and importance of interpretation in qualitative inquiry, and explores the ways methodology, data, and the self/researcher as instrument interact and impact interpretive processes. Additionally, the essay presents a series of strategies for qualitative researchers engaged in the process of interpretation. The essay closes by presenting a framework designed to inform qualitative researchers’ interpretations. The framework includes attention to the key qualitative research concepts of transparency, reflexivity, analysis, validity, evidence, and literature. Four questions frame the essay: What is interpretation, and why are interpretive strategies important in qualitative research? How do methodology, data, and the researcher/self impact interpretation in qualitative research? How do qualitative researchers engage in the process of interpretation? And, in what ways can a framework for interpretation strategies support qualitative researchers across multiple methodologies and paradigms?

“All human knowledge takes the form of interpretation.” In this seemingly simple statement, the late German philosopher Walter Benjamin asserts that all knowledge is mediated and constructed. He makes no distinction between physical and social sciences and so situates himself as an interpretivist: one who believes that human subjectivity (individuals’ characteristics, feelings, opinions, and experiential backgrounds) impacts observations, the analysis of these observations, and the resultant knowledge/truth constructions. Contrast this perspective with positivist claims that knowledge is based exclusively on external facts, objectively observed and recorded. Interpretivists, then, acknowledge that, if positivistic notions of knowledge and truth are inadequate to explain social phenomena, then positivist, hard science approaches to research (i.e., the scientific method and its variants) are also inadequate. So, although the literature often contrasts quantitative and qualitative research as largely a difference in the kinds of data employed (numerical vs. linguistic), the primary differentiation is instead in the foundational, paradigmatic assumptions about truth, knowledge, and objectivity.

This chapter is about interpretation and the strategies that qualitative researchers use to interpret a wide variety of “texts.” Knowledge, we assert, is constructed, both individually (constructivism) and socially (constructionism). We accept this as our starting point. Our aim here is to share our perspective on a broad set of concepts associated with the interpretive or meaning-making process. Although it may happen at different times and in different ways, interpretation is a part of almost all qualitative research.

Qualitative research is an umbrella term that encompasses a wide array of paradigmatic views, goals, and methods. Still, there are key unifying elements that include a generally constructionist epistemological standpoint, attention to primarily linguistic data, and generally accepted protocols or syntax for conducting research. Typically, qualitative researchers begin with a starting point—a curiosity, a problem in need of solutions, a research question, or a desire to better understand a situation from the perspectives of the individuals who inhabit that context (what qualitative researchers call the “emic,” or insider’s, perspective).

From this starting point, researchers determine the appropriate kinds of data to collect, engage in fieldwork as participant-observers to gather these data, organize the data, look for patterns, and then attempt to make sense out of the data by synthesizing research “findings,” “assertions,” or “theories” in ways that can be shared so that others may also gain insights from the conducted inquiry.

Although there are commonalities that cut across most forms of qualitative research, this is not to say that there is an accepted, linear, standardized approach. To be sure, there are an infinite number of variations and nuances in the qualitative research process. For example, some forms of inquiry begin with a firm research question, others without even a clear focus for study. Grounded theorists begin data analysis and interpretation very early in the research process, whereas some case study researchers, for example, may collect data in the field for a period of time before seriously considering the data and its implications. Some ethnographers may be a part of the context (e.g., observing in classrooms), but they may assume more observer-like roles, as opposed to actively participating in the context. Alternatively, action researchers, in studying issues about their own practice, are necessarily situated toward the “participant” end of the participant–observer continuum.

Our focus here is on one integrated part of the qualitative research process, interpretation, the process of collective and individual “meaning making.” As we discuss throughout this chapter, researchers take a variety of approaches to interpretation in qualitative work. Four general questions guide our explorations:

What is interpretation, and why are interpretive strategies important in qualitative research?

How do methodology, data, and the researcher/self impact interpretation in qualitative research?

How do qualitative researchers engage in the process of interpretation?

In what ways can a framework for interpretation strategies support qualitative researchers across multiple methodological and paradigmatic views?

We address each of these guiding questions in our attempt to explicate our interpretation of “interpretation,” and, as educational researchers, we include examples from our own work to illustrate some key concepts.

What Is Interpretation, and Why Are Interpretive Strategies Important in Qualitative Research?

Qualitative researchers and those writing about qualitative methods often intertwine the terms analysis and interpretation. For example, Hubbard and Power (2003) describe data analysis as “bringing order, structure, and meaning to the data” (p. 88). To us, this description combines analysis with interpretation. Although there is nothing wrong with this construction, our understanding aligns more closely with Mills’s (2007) claim that, “put simply, analysis involves summarizing what’s in the data, whereas interpretation involves making sense of—finding meaning in—that data” (p. 122). For the purpose of this chapter, we’ll adhere to Mills’s distinction, understanding analysis as summarizing and organizing, and interpretation as meaning making. Unavoidably, these closely related processes overlap and interact, but our focus will be primarily on the more complex of these endeavors, interpretation. Interpretation, in this sense, is in part translation, but translation is not an objective act. Instead, translation necessarily involves selectivity and the ascribing of meaning. Qualitative researchers “aim beneath manifest behavior to the meaning events have for those who experience them” (Eisner, 1991, p. 35). The presentation of these insider/emic perspectives is a hallmark of qualitative research.

Qualitative researchers have long borrowed from extant models for fieldwork and interpretation. Approaches from anthropology and the arts have become especially prominent. For example, Eisner’s form of qualitative inquiry, “educational criticism” (1991), draws heavily on accepted models of art criticism. Barrett (2000), an authority on art criticism, describes interpretation as a complex set of processes based on a set of principles. We believe many of these principles apply as readily to qualitative research as they do to critique. The following principles, adapted from Barrett’s principles of interpretation (2000, pp. 113–120), inform our examination:

Qualitative phenomena have “aboutness”: All social phenomena have meaning, but meanings in this context can be multiple, even contradictory.

Interpretations are persuasive arguments: All interpretations are arguments, and qualitative researchers, like critics, strive to build strong arguments grounded in the information, or data, available.

Some interpretations are better than others: Barrett notes that “some interpretations are better argued, better grounded with evidence, and therefore more reasonable, more certain, and more acceptable than others” (p. 115). This contradicts the argument that “all interpretations are equal,” heard in the common refrain, “well, that’s just your interpretation.”

There can be different, competing, and contradictory interpretations of the same phenomena: As noted at the beginning of this chapter, we acknowledge that subjectivity matters and, unavoidably, it impacts one’s interpretations. As Barrett (2000) notes, “Interpretations are often based on a worldview” (p. 116).

Interpretations are not (and can’t be) “right,” but instead, they can be more or less reasonable, convincing, and informative: There is never one “true” interpretation, but some interpretations are more compelling than others.

Interpretations can be judged by coherence, correspondence, and inclusiveness: Does the argument/interpretation make sense (coherence)? Does the interpretation fit the data (correspondence)? Have all data been attended to, including outlier data that don’t necessarily support identified themes (inclusiveness)?

Interpretation is ultimately a communal endeavor: Initial interpretations may be incomplete, nearsighted, and/or narrow, but eventually, these interpretations become richer, broader, and more inclusive. Feminist revisionist history projects are an exemplary case. Over time, the writing, art, and cultural contributions of countless women, previously ignored, diminished, or distorted, have come to be recognized as prominent contributions worthy of serious consideration.

So, meaning is conferred; interpretations are socially constructed arguments; multiple interpretations are to be expected; and some interpretations are better than others. As we discuss later in this chapter, what makes an interpretation “better” often hinges on the purpose/goals of the research in question. Interpretations designed to generate theory, or generalizable rules, will be “better” for responding to research questions aligned with the aims of more traditional quantitative/positivist research, whereas interpretations designed to construct meanings through social interaction, to generate multiple perspectives, and to represent the context-specific perspectives of the research participants are “better” for researchers constructing thick, contextually rich descriptions, stories, or narratives. The former relies on more “atomistic” interpretive strategies, whereas the latter adheres to a more “holistic” approach (Willis, 2007). Both approaches to analysis/interpretation are addressed in more detail later in this chapter.

At this point, readers might ask, why does interpretation matter, anyway? Our response to this question involves the distinctive nature of interpretation and the ability of the interpretive process to put unique fingerprints on an otherwise relatively static set of data. Once interview data are collected and transcribed (and we realize that even the process of transcription is, in part, interpretive), documents are collected, and observations are recorded, qualitative researchers could just, in good faith and with fidelity, represent the data as straightforwardly as possible, allowing readers to “see for themselves” by sharing as much actual data (e.g., the transcribed words of the research participants) as possible. This approach, however, includes analysis, what we have defined as summarizing and organizing data for presentation, but it falls short of what we actually reference and define as interpretation—attempting to explain the meaning of others’ words and actions. “While early efforts at qualitative research might have stopped at description, it is now more generally accepted that a qualitative researcher adds understanding and interpretation to the description” (Lichtman, 2006, p. 8).

As we are fond of the arts and arts-based approaches to qualitative research, an example from the late jazz drummer Buddy Rich seems fitting. Rich explains the importance of having the flexibility to interpret: “I don’t think any arranger should ever write a drum part for a drummer, because if a drummer can’t create his own interpretation of the chart, and he plays everything that’s written, he becomes mechanical; he has no freedom.” The same is true for qualitative researchers; without the freedom to interpret, the researcher merely regurgitates, attempting to share with readers/reviewers exactly what the research subjects shared with him or her. It is only through interpretation that the researcher, as a collaborator with unavoidable subjectivities, is able to construct unique, contextualized meaning. Interpretation, then, in this sense, is knowledge construction.

In closing this section, we’ll illustrate the analysis versus interpretation distinction with the following transcript excerpt. In this study, the authors (Trent & Zorko, 2006) were studying student teaching from the perspective of K–12 students. This quote comes from a high school student in a focus group interview. She is describing a student teacher she had:

The right-hand column contains “codes” or labels applied to parts of the transcript text. Coding will be discussed in more depth later in this chapter, but, for now, note that the codes are mostly summarizing the main ideas of the text, sometimes using the exact words of the research participant. This type of coding is a part of what we’ve called analysis—organizing and summarizing the data. It’s a way of beginning to say, “what is” there. As noted, though, most qualitative researchers go deeper. They want to know more than “what is”; they also ask, “what does it mean?” This is a question of interpretation.

Specific to the transcript excerpt, researchers might next begin to cluster the early codes into like groups. For example, the teacher “felt targeted,” “assumed kids were going to behave inappropriately,” and appeared to be “overwhelmed.” A researcher might cluster this group of codes in a category called “teacher feelings and perceptions” and may then cluster the codes “could not control class” and “students off task” into a category called “classroom management.” The researcher, then, in taking a fresh look at these categories and the included codes, may begin to conclude that what’s going on in this situation is that the student teacher does not have sufficient training in classroom management models and strategies and may also be lacking the skills she needs to build relationships with her students. These, then, would be interpretations, persuasive arguments connected to the study’s data. In this specific example, the researchers might proceed to write a memo about these emerging interpretations. In this memo, they might more clearly define their early categories and may also look through other data to see if there are other codes or categories that align with or overlap with this initial analysis. They might write further about their emergent interpretations and, in doing so, may inform future data collection in ways that will allow them to either support or refute their early interpretations. These researchers will also likely find that the processes of analysis and interpretation are inextricably intertwined. Good interpretations very often depend on thorough and thoughtful analyses.
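To make this clustering step concrete, here is a minimal sketch in Python of how codes might be grouped into categories and queried. The code labels and category names come from the example above; the segment identifiers (e.g., “FG1:12”) and the data structure itself are hypothetical illustrations, not artifacts of the original study.

```python
# A minimal sketch of clustering codes into categories.
# Codes and categories are from the student-teaching example above;
# the segment IDs (e.g., "FG1:12") are hypothetical placeholders.

codes_to_segments = {
    "felt targeted": ["FG1:12", "FG3:04"],
    "assumed kids would behave inappropriately": ["FG1:13"],
    "overwhelmed": ["FG2:27"],
    "could not control class": ["FG1:15", "FG2:31"],
    "students off task": ["FG2:32"],
}

# Cluster related codes under category labels.
categories = {
    "teacher feelings and perceptions": [
        "felt targeted",
        "assumed kids would behave inappropriately",
        "overwhelmed",
    ],
    "classroom management": [
        "could not control class",
        "students off task",
    ],
}

def segments_for(category):
    """Collect every data segment filed under a category, the raw
    material for an analytic memo about that category."""
    return [seg
            for code in categories[category]
            for seg in codes_to_segments.get(code, [])]

print(segments_for("classroom management"))  # ['FG1:15', 'FG2:31', 'FG2:32']
```

A structure like this also makes the memo-writing step easier to organize: the researcher can pull every segment behind a category and check whether an emerging interpretation actually fits all of them.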

How Do Methodology, Data, and the Researcher/Self Impact Interpretation in Qualitative Research?

Methodological conventions guide interpretation and the use of interpretive strategies. For example, in grounded theory and in similar methodological traditions, “formal analysis begins early in the study and is nearly completed by the end of data collection” (Bogdan & Biklen, 2003, p. 66). Alternatively, for researchers from other traditions, for example, case study researchers, “Formal analysis and theory development [interpretation] do not occur until after the data collection is near complete” (p. 66).

Researchers subscribing to methodologies that prescribe early data analysis and interpretation may employ methods like analytic induction or the constant comparison method. In using analytic induction, researchers develop a rough definition of the phenomena under study; collect data to compare to this rough definition; modify the definition as needed, based on cases that both fit and don’t fit the definition; and, finally, establish a clear, universal definition (theory) of the phenomena (Robinson, 1951, cited in Bogdan & Biklen, 2003, p. 65). Generally, those using a constant comparison approach begin data collection immediately; identify key issues, events, and activities related to the study that then become categories of focus; collect data that provide incidents of these categories; write about and describe the categories, accounting for specific incidents and seeking others; discover basic processes and relationships; and, finally, code and write about the categories as theory, “grounded” in the data (Glaser, 1965). Although processes like analytic induction and constant comparison can be listed as “steps” to follow, in actuality these are more typically recursive processes in which the researcher repeatedly goes back and forth between the data and the emerging analyses and interpretations.
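Because analytic induction is explicitly iterative, its control flow can be sketched in code. The sketch below is only an analogy under loose assumptions: the cases, the fits() judgment, and the revise() step are hypothetical stand-ins for what is, in practice, the researcher’s reading and rereading of rich qualitative cases.

```python
# An analogy for analytic induction's recursive control flow, not a
# real analysis: test a working definition against each case and
# revise it whenever a case doesn't fit, until every case fits.

cases = ["case_1", "case_2", "case_3"]   # hypothetical qualitative cases
definition = "rough initial definition of the phenomenon"
accommodated = set()                     # cases the definition accounts for

def fits(case, definition):
    # Stand-in for the researcher's judgment about whether a case
    # is explained by the current working definition.
    return case in accommodated

def revise(definition, case):
    # Stand-in for redefining the phenomenon (or excluding the case)
    # so the definition accounts for the discrepant case.
    accommodated.add(case)
    return definition + f"; revised to account for {case}"

changed = True
while changed:          # keep cycling until no case forces a revision
    changed = False
    for case in cases:
        if not fits(case, definition):
            definition = revise(definition, case)
            changed = True

print(definition)       # the provisional "universal" definition/theory
```

The outer loop is the point of the analogy: the definition is never finished after one pass; every revision sends the researcher back through all the cases again.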

In addition to methodological conventions that prescribe data analysis early (e.g., grounded theory) or later (e.g., case study) in the inquiry process, methodological approaches also impact the general approach to analysis and interpretation. Ellingson (2011) situates qualitative research methodologies on a continuum anchored by “science”-like approaches on one end and “art”-like approaches on the other.

Researchers pursuing a more science-oriented approach seek valid, reliable, generalizable knowledge; believe in neutral, objective researchers; and ultimately claim single, authoritative interpretations. Researchers adhering to these science-focused, post-positivistic approaches may count frequencies, emphasize the validity of the employed coding system, and point to intercoder reliability and random sampling as criteria that bolster the research’s credibility. Researchers at or near the science end of the continuum might employ analysis and interpretation strategies that include “paired comparisons,” “pile sorts,” “word counts,” identifying “key words in context,” and “triad tests” (Ryan & Bernard, 2000, pp. 770–776). These researchers may ultimately seek to develop taxonomies or other authoritative final products that organize and explain the collected data.
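Two of the strategies Ryan and Bernard list, word counts and key words in context, are mechanical enough to sketch directly. The snippet below is a minimal, standard-library illustration with an invented sample sentence; it is not drawn from any particular qualitative analysis package.

```python
# Minimal sketches of two "science-end" strategies: word counts and
# key-words-in-context (KWIC). The sample text is an invented placeholder.

from collections import Counter
import re

text = ("The students said the classroom felt welcoming. "
        "Several students said the teacher knew every name.")

tokens = re.findall(r"[a-z']+", text.lower())

# Word counts: how often each token appears across the corpus.
print(Counter(tokens).most_common(3))  # [('the', 3), ('students', 2), ('said', 2)]

# KWIC: show each occurrence of a key word with n words of context.
def kwic(tokens, keyword, n=3):
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - n):i])
            right = " ".join(tokens[i + 1:i + 1 + n])
            print(f"{left:>30} | {keyword} | {right}")

kwic(tokens, "students")
```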

For example, in a study we conducted about preservice teachers’ experiences learning to teach second-language learners, we collected a larger dataset and used a statistical analysis package to analyze survey data, and the resultant findings included descriptive statistics. These survey results were supported with open-ended, qualitative data. For example, one of the study’s findings was that “a strong majority of candidates (96%) agreed that an immersion approach alone will not guarantee academic or linguistic success for second language learners.” In narrative explanations, one preservice teacher remarked, “there has to be extra instructional efforts to help their students learn English... they won’t learn English by merely sitting in the classrooms” (Cho, Rios, Trent, & Mayfield, 2012, p. 75).
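As a sketch of the kind of descriptive statistic reported here, the snippet below computes an agreement percentage from hypothetical Likert-style responses. The data are invented so the arithmetic lands on the 96% figure quoted above; the original study’s actual dataset and analysis package are not shown.

```python
# Computing a simple descriptive statistic: the percentage of
# respondents who agreed with a survey item. The responses are
# invented placeholders, sized so the result echoes the 96% above.

responses = (["strongly agree"] * 44 + ["agree"] * 48 +
             ["neutral"] * 1 + ["disagree"] * 3)

agreed = sum(r in ("strongly agree", "agree") for r in responses)
pct = 100 * agreed / len(responses)
print(f"{pct:.0f}% of candidates agreed")   # -> "96% of candidates agreed"
```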

Methodologies on the “art” side of Ellingson’s (2011) continuum, alternatively, “value humanistic, openly subjective knowledge, such as that embodied in stories, poetry, photography, and painting” (p. 599). Analysis and interpretation in these (often more contemporary) methodological approaches strive not for “social scientific truth,” but instead are formulated to “enable us to learn about ourselves, each other, and the world through encountering the unique lens of a person’s (or a group’s) passionate rendering of a reality into a moving, aesthetic expression of meaning” (p. 599). For these “artistic/interpretivists, truths are multiple, fluctuating and ambiguous” (p. 599). Methodologies taking more artistic, subjective approaches to analysis and interpretation include autoethnography, testimonio, performance studies, feminist approaches, and other related critical methodological forms of qualitative practice.

As an example, one of us engaged in an artistic inquiry with a group of students in an art class for elementary teachers. We called it “Dreams as Data,” and, among the project aims, we wanted to gather participants’ “dreams for education in the future” and share these dreams in an accessible, interactive, artistic display (see Trent, 2002). The intent here was not to statistically analyze the dreams/data; instead, it was more universal. We wanted, as Ellingson (2011) noted, to use participant responses in ways that “enable us to learn about ourselves, each other, and the world.” The decision was made to leave responses intact and to share the whole/raw dataset in the artistic display in ways that allowed viewers to holistically analyze and interpret for themselves. The following text is an excerpt from one response:

Almost a century ago, John Dewey eloquently wrote about the need to imagine and create the education that ALL children deserve, not just the richest, the Whitest, or the easiest to teach. At the dawn of this new century, on some mornings, I wake up fearful that we are further away from this ideal than ever.... Collective action, in a critical, hopeful, joyful, anti-racist and pro-justice spirit, is foremost in my mind as I reflect on and act in my daily work.... Although I realize the constraints on teachers and schools in the current political arena, I do believe in the power of teachers to stand next to, encourage, and believe in the students they teach—in short, to change lives. (Trent, 2002, p. 49)

In sum, researchers whom Ellingson (2011) characterizes as being on the science end of the continuum typically use more detailed or “atomistic” strategies to analyze and interpret qualitative data, whereas those toward the artistic end most often employ more holistic strategies. Both of these general approaches to qualitative data analysis and interpretation, atomistic and holistic, will be addressed later in this chapter.

As noted, qualitative researchers attend to data in a wide variety of ways depending on paradigmatic and epistemological beliefs, methodological conventions, and the purpose/aims of the research. These factors impact the kinds of data collected and the ways these data are ultimately analyzed and interpreted. For example, life history or testimonio researchers conduct extensive individual interviews, ethnographers record detailed observational notes, critical theorists may examine documents from pop culture, and ethnomethodologists may collect videotapes of interaction for analysis and interpretation.

In addition to the wide range of data types that are collected by qualitative researchers (and most qualitative researchers collect multiple forms of data), qualitative researchers, again influenced by the factors noted earlier, employ a variety of approaches to analyzing and interpreting data. As mentioned earlier in this chapter, some advocate for a detailed/atomistic, fine-grained approach to data (see, e.g., Miles & Huberman, 1994); others, a more broad-based, holistic “eyeballing” of the data. “Eyeballers reject the more structured approaches to analysis that break down the data into small units and, from the perspective of the eyeballers, destroy the wholeness and some of the meaningfulness of the data” (Willis, 2007, p. 298).

Regardless, we assert, as illustrated in Figure 30.1, that as the process evolves, data collection becomes less prominent and interpretation, the making of sense/meaning from the data, becomes more prominent. It is through this emphasis on interpretation that qualitative researchers put their individual imprints on the data, allowing for the emergence of multiple, rich perspectives. This space for interpretation allows researchers the “freedom” Buddy Rich alluded to in his quote about interpreting musical charts. Without this freedom, Rich noted, the process would be simply “mechanical.” Furthermore, allowing space for multiple interpretations nourishes the perspectives of many others in the community. Writer and theorist Meg Wheatley explains, “everyone in a complex system has a slightly different interpretation. The more interpretations we gather, the easier it becomes to gain a sense of the whole.”

Figure 30.1. As emphasis on data/data collection decreases, emphasis on interpretation increases.

In addition to the roles methodology and data play in the interpretive process, perhaps the most important is the role of the self/researcher. “She is the one who asks the questions. She is the one who conducts the analyses. She is the one who decides who to study and what to study. The researcher is the conduit through which information is gathered and filtered” (Lichtman, 2006, p. 16). Eisner (1991) supports the notion of the researcher “self as instrument,” noting that expert researchers don’t simply know what to attend to, but also what to neglect. He describes the researcher’s role in the interpretive process as combining sensibility, the ability to observe and ascertain nuances, with schema, a deep understanding or cognitive framework of the phenomena under study.

Barrett (2007) describes self/researcher roles as “transformations” (p. 418) at multiple points throughout the inquiry process: early in the process, researchers create representations through data generation, conducting observations and interviews and collecting documents and artifacts. Another “transformation occurs when the ‘raw’ data generated in the field are shaped into data records by the researcher. These data records are produced through organizing and reconstructing the researcher’s notes and transcribing audio and video recordings in the form of permanent records that serve as the ‘evidentiary warrants’ of the generated data. The researcher strives to capture aspects of the phenomenal world with fidelity by selecting salient aspects to incorporate into the data record” (p. 418). Transformation continues when the researcher analyzes, codes, categorizes, and explores patterns in the data (the process we call analysis). Transformations also involve interpreting what the data mean and relating these “interpretations to other sources of insight about the phenomena, including findings from related research, conceptual literature, and common experience.... Data analysis and interpretation are often intertwined and rely upon the researcher’s logic, artistry, imagination, clarity, and knowledge of the field under study” (Barrett, 2007, p. 418).

We mentioned the often-blended roles of participation and observation earlier in this chapter. The role(s) of the self/researcher are often described as points along a “participant/observer continuum” (see, e.g., Bogdan & Biklen, 2003). On the far “observer” end of this continuum, the researcher situates as detached, tries to be inconspicuous (so as not to impact/disrupt the phenomena under study), and approaches the studied context as if viewing it from behind a one-way mirror. On the opposite, “participant” end, the researcher is completely immersed and involved in the context. It would be difficult for an outsider to distinguish between researcher and subjects. For example, “some feminist researchers and some postmodernists take on a political stance as well and have an agenda that places the researcher in an activist posture. These researchers often become quite involved with the individuals they study and try to improve their human condition” (Lichtman, 2006, p. 9).

We assert that most researchers fall somewhere between these poles. We believe that complete detachment is both impossible and misguided. In doing so, we, along with many others, acknowledge (and honor) the role of subjectivity: the researcher’s beliefs, opinions, biases, and predispositions. Positivist researchers seeking objective data and accounts either ignore the impact of subjectivity or attempt to drastically diminish or eliminate its impact. Even qualitative researchers have developed methods to keep researcher subjectivity from affecting data collection, analysis, and interpretation. For example, the foundational phenomenologist Husserl (1913/1962) developed the concept of “bracketing,” what Lichtman describes as “trying to identify your own views on the topic and then putting them aside” (2006, p. 13). Like Slotnick and Janesick (2011), we ultimately claim that “it is impossible to bracket yourself” (p. 1358). Instead, we take a balanced approach, like Eisner’s, understanding that subjectivity allows researchers to produce the rich, idiosyncratic, insightful, and yet data-based interpretations and accounts of lived experience that accomplish the primary purposes of qualitative inquiry. “Rather than regarding uniformity and standardization as the summum bonum, educational criticism [Eisner’s form of qualitative research] views unique insight as the higher good” (Eisner, 1991, p. 35). That said, we also claim that, just because we acknowledge and value the role of researcher subjectivity, researchers are still obligated to ground their findings in reasonable interpretations of the data. Eisner (1991) explains:

This appreciation for personal insight as a source of meaning does not provide a license for freedom. Educational critics must provide evidence and reasons. But they reject the assumption that unique interpretation is a conceptual liability in understanding, and they see the insights secured from multiple views as more attractive than the comforts provided by a single right one. (p. 35)

Connected to this participant/observer continuum is the way the researcher positions him- or herself in relation to the “subjects” of the study. Traditionally, researchers, including early qualitative researchers, anthropologists, and ethnographers, referenced those studied as “subjects.” More recently, qualitative researchers better understand that research should be a reciprocal process in which both the researcher and the foci of the research derive meaningful benefit. Researchers aligned with this thinking frequently use the term “participants” to describe those groups and individuals included in a study. Going a step further, some researchers view research participants as experts on the studied topic and as equal collaborators in the meaning-making process. In these instances, researchers often use the terms “co-researchers” or “co-investigators.”

The qualitative researcher, then, plays significant roles throughout the inquiry process. These roles include transforming data, collaborating with research participants or co-researchers, determining appropriate points to situate along the participant/observer continuum, and ascribing personal insights, meanings, and interpretations that are both unique and justified with data exemplars. Performing these roles unavoidably impacts and changes the researcher. “Since, in qualitative research the individual is the research instrument through which all data are passed, interpreted, and reported, the scholar’s role is constantly evolving as self evolves” (Slotnick & Janesick, 2011, p. 1358).

As we note later, key in all of this is for researchers to be transparent about the topics discussed in the preceding section: What methodological conventions have been employed, and why? How have data been treated throughout the inquiry to arrive at assertions and findings that may or may not be transferable to other idiosyncratic contexts? And, finally, in what ways has the researcher/self been situated in, and impacted, the inquiry? Unavoidably, we assert, the self lies at the critical intersection of data and theory and, as such, two legs of this stool, data and researcher, interact to create the third, theory.

How Do Qualitative Researchers Engage in the Process of Interpretation?

Theorists seem to have a propensity to dichotomize concepts, pulling them apart and placing binary opposites on far ends of conceptual continua. Qualitative research theorists are no different, and we have already mentioned some of these continua in this chapter. For example, in the last section, we discussed the participant–observer continuum. Earlier, we referenced both Willis’s (2007) conceptualization of “atomistic” versus “holistic” approaches to qualitative analysis and interpretation and Ellingson’s (2011) science–art continuum. Each of these latter two conceptualizations informs “how qualitative researchers engage in the process of interpretation.”

Willis (2007) shares that the purpose of a qualitative project might be explained as “what we expect to gain from research” (p. 288). The purpose, or “what we expect to gain,” then guides and informs the approaches researchers might take to interpretation. Some researchers, typically positivist/postpositivist, conduct studies that aim to test theories about how the world works and/or people behave. These researchers attempt to discover general laws, truths, or relationships that can be generalized. Others, less confident in the ability of research to attain a single, generalizable law or truth, might seek “local theory.” These researchers still seek truths, but “instead of generalizable laws or rules, they search for truths about the local context... to understand what is really happening and then to communicate the essence of this to others” (Willis, 2007, p. 291). In both of these purposes, researchers employ atomistic strategies in an inductive process in which researchers “break the data down into small units and then build broader and broader generalizations as the data analysis proceeds” (p. 317). The earlier mentioned processes of analytic induction, constant comparison, and grounded theory fit within this conceptualization of atomistic approaches to interpretation. For example, a line-by-line coding of a transcript might begin an atomistic approach to data analysis.

Alternatively, other researchers pursue distinctly different aims. Researchers with an “objective description” purpose focus on accurately describing the people and context under study. These researchers adhere to standards and practices designed to achieve objectivity, and their approach to interpretation falls between the poles of the atomistic/holistic distinction.

The purpose of hermeneutic approaches to research is to “understand the perspectives of humans. And because understanding is situational, hermeneutic research tends to look at the details of the context in which the study occurred. The result is generally rich data reports that include multiple perspectives” (Willis, 2007, p. 293).

Still other researchers see their purpose as the creation of stories or narratives that utilize “a social process that constructs meaning through interaction... it is an effort to represent in detail the perspectives of participants... whereas description produces one truth about the topic of study, storytelling may generate multiple perspectives, interpretations, and analyses by the researcher and participants” (Willis, 2007, p. 295).

In these latter purposes (hermeneutic, storytelling, narrative production), researchers typically employ more holistic strategies. “Holistic approaches tend to leave the data intact and to emphasize that meaning must be derived from a contextual reading of the data rather than the extraction of data segments for detailed analysis” (p. 297). This was the case with the “Dreams as Data” project mentioned earlier.

We understand the propensity to dichotomize, to situate concepts as binary opposites, and to create neat continua between these polar descriptors. These sorts of reduction and deconstruction support our understandings and, hopefully, enable us to eventually reconstruct these ideas in meaningful ways. Still, in reality, we realize most of us will, and should, work in the middle of these conceptualizations in fluid ways that allow us to pursue the strategies, processes, and theories most appropriate for the research task at hand. As noted, Ellingson (2011) sets up another conceptual continuum, but, like us, she advises researchers to “straddle multiple points across the field of qualitative methods” (p. 595). She explains, “I make the case for qualitative methods to be conceptualized as a continuum anchored by art and science, with vast middle spaces that embody infinite possibilities for blending artistic, expository, and social scientific ways of analysis and representation” (p. 595).

We explained at the beginning of this chapter that we view analysis as organizing and summarizing qualitative data, and interpretation as constructing meaning. In this sense, analysis allows us to “describe” the phenomena under study. It enables us to succinctly answer “what” and “how” questions and ensures that our descriptions are grounded in the data collected. Descriptions, however, rarely respond to questions of “why?” Why questions are the domain of interpretation, and, as noted throughout this text, interpretation is complex. “Traditionally, qualitative inquiry has concerned itself with what and how questions... qualitative researchers typically approach why questions cautiously, explanation is tricky business” (Gubrium & Holstein, 2000, p. 502). Eisner (1991) describes this distinctive nature of interpretation: “it means that inquirers try to account for [interpretation] what they have given account of” (p. 35).

Our focus here is on interpretation, but interpretation requires analysis, for without having clear understandings of the data and its characteristics, derived through systematic examination and organization (e.g., coding, memoing, categorizing, etc.), “interpretations” resulting from inquiry will likely be incomplete, uninformed, and inconsistent with the constructed perspectives of the study participants. Fortunately for qualitative researchers, we have many sources that lead us through analytic processes. We earlier mentioned the accepted processes of analytic induction and the constant comparison method. These detailed processes (see, e.g., Bogdan & Biklen, 2003) combine the inextricably linked activities of analysis and interpretation, with “analysis” more typically appearing as earlier steps in the process and meaning construction—“interpretation”—happening later.

A wide variety of resources support researchers engaged in the processes of analysis and interpretation. Saldaña (2011), for example, provides a detailed description of coding types and processes. He shows researchers how to use process coding (uses gerunds, “-ing” words, to capture action), in vivo coding (uses the actual words of the research participants/subjects), descriptive coding (uses nouns to summarize the data topics), versus coding (uses “vs.” to identify conflicts and power issues), and values coding (identifies participants’ values, attitudes, and/or beliefs). To exemplify some of these coding strategies, we include an excerpt from a transcript of a meeting of a school improvement committee. In this study, the collaborators were focused on building “school community.” The excerpt below, reproduced once with the codes each strategy yields, illustrates the application of a variety of codes described by Saldaña to this text:

Meeting transcript excerpt: “Let’s start talking about what we want to get out of this. What I’d like to hear is each of us sharing what we’re doing relative to this idea of building community. ‘Here’s what I’m doing. Here’s what worked. Here’s what didn’t work. I’m happy with this. I’m sad with this,’ and just hearing each of us reflecting about what we’re doing I think will be interesting. That collaboration will be extremely valuable in terms of not only our relationships with one another, but also understanding the idea of community in more specific and concrete ways.”

  • Process coding: Talking; Sharing; Building; Listening; Collaborating; Understanding
  • In vivo coding: “Talking about what we want to get out of this”; “Each of us sharing”; “Hearing each of us reflecting”; “Collaboration will be extremely valuable”; “Relationships”
  • Descriptive coding: Open, participatory discussion; Identification of effective strategies; Collaborative, productive relationships; Robust understandings
  • Versus coding: Effective vs. ineffective strategies; Positive reflections vs. negative reflections
  • Values coding: Sharing; Building community; Reflection; Collaboration; Relationships; Deeper understandings
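To see how these coding types differ in kind, it can help to hold the same segment with all five code sets side by side. The sketch below stores the committee-meeting excerpt with abbreviated codes from the list above; the dictionary structure and the lookup at the end are our illustration, not a feature of any particular qualitative analysis tool.

```python
# One transcript segment carrying five different code sets, following
# the coding types Saldaña (2011) describes. The codes are abbreviated
# from the excerpt above; the structure itself is illustrative only.

segment = {
    "text": "Let's start talking about what we want to get out of this...",
    "codes": {
        "process": ["talking", "sharing", "building", "listening",
                    "collaborating", "understanding"],
        "in_vivo": ["talking about what we want to get out of this",
                    "each of us sharing",
                    "collaboration will be extremely valuable"],
        "descriptive": ["open, participatory discussion",
                        "identification of effective strategies"],
        "versus": ["effective vs. ineffective strategies",
                   "positive vs. negative reflections"],
        "values": ["sharing", "building community", "reflection",
                   "collaboration", "relationships"],
    },
}

# For instance: which coding types tagged this segment with "sharing"?
hits = [ctype for ctype, codes in segment["codes"].items()
        if any("sharing" in code for code in codes)]
print(hits)  # ['process', 'in_vivo', 'values']
```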

To connect and elaborate the ideas developed in coding, Saldaña (2011) suggests researchers categorize the applied codes, write memos to deepen understandings and illuminate additional questions, and identify emergent themes. To begin the categorization process, Saldaña recommends all codes be “classified into similar clusters... once the codes have been classified, a category label is applied to them” (p. 97). So, in continuing with the school community example coded above, the researcher might create a cluster/category called “Value of Collaboration,” and in this category might include the codes “relationships,” “building community,” and “effective strategies.”

Having coded and categorized a study’s various data forms, a typical next step for researchers is to write “memos” or “analytic memos.” Writing analytic memos allows the researcher(s) to “set in words your interpretation of the data... an analytic memo further articulates your... thinking processes on what things may mean... as the study proceeds, however, initial and substantive analytic memos can be revisited and revised for eventual integration into the report itself” (Saldaña, 2011, p. 98). In the study of student teaching from K–12 students’ perspectives (Trent & Zorko, 2006), we noticed throughout our analysis a series of focus group interview quotes coded “names.” The following quote from a high school student is representative of many others:

I think that, ah, they [student teachers] should like know your face and your name because, uh, I don’t like it if they don’t and they’ll just like... cause they’ll blow you off a lot easier if they don’t know, like our new principal is here... he is, like, he always, like, tries to make sure to say hi even to the, like, not popular people if you can call it that, you know, and I mean, yah, and the people that don’t usually socialize a lot, I mean he makes an effort to know them and know their name like so they will cooperate better with him.

Although we didn’t ask the focus groups a specific question about whether or not student teachers knew the K–12 students’ names, the topic came up in every focus group interview. We coded the above excerpt and the others, “knowing names,” and these data were grouped with others under the category “relationships.” In an initial analytic memo about this, the researchers wrote:

STUDENT TEACHING STUDY—MEMO #3 “Knowing Names as Relationship Building”

Most groups made unsolicited mentions of student teachers knowing, or not knowing, their names. We haven’t asked students about this, but it must be important to them because it always seems to come up. Students expected student teachers to know their names. When they did, students noticed and seemed pleased. When they didn’t, students seemed disappointed, even annoyed. An elementary student told us that early in the semester, “she knew our names... cause when we rose [sic] our hands, she didn’t have to come and look at our name tags... it made me feel very happy.” A high schooler, expressing displeasure that his student teacher didn’t know students’ names, told us, “They should like know your name because it shows they care about you as a person. I mean, we know their names, so they should take the time to learn ours too.” Another high school student said that even after 3 months, she wasn’t sure the student teacher knew her name. Another student echoed, “same here.” Each of these students asserted that this (knowing students’ names) had impacted their relationship with the student teacher. This high school student focus group stressed that a good relationship, built early, directly impacts classroom interaction and student learning. A student explained it like this: “If you get to know each other, you can have fun with them... they seem to understand you more, you’re more relaxed, and learning seems easier.”

To illustrate how different coding approaches treat the same data, consider the following meeting transcript excerpt, coded five different ways:

Meeting transcript excerpt: “Let’s start talking about what we want to get out of this. What I’d like to hear is each of us sharing what we’re doing relative to this idea of building community. ‘Here’s what I’m doing. Here’s what worked. Here’s what didn’t work. I’m happy with this. I’m sad with this,’ and just hearing each of us reflecting about what we’re doing I think will be interesting. That collaboration will be extremely valuable in terms of not only our relationships with one another, but also understanding the idea of community in more specific and concrete ways.”

Process coding: Talking; Sharing; Building; Listening; Collaborating; Understanding

In vivo coding: “Talking about what we want to get out of this”; “Each of us sharing”; “Hearing each of us reflecting”; “Collaboration will be extremely valuable”; “Relationships”

Descriptive coding: Open, participatory discussion; Identification of effective strategies; Collaborative, productive relationships; Robust understandings

Versus coding: Effective vs. ineffective strategies; Positive reflections vs. negative reflections

Values coding: Sharing; Building community; Reflection; Collaboration; Relationships; Deeper understandings

As noted in these brief examples, coding, categorizing, and writing memos about a study’s data are all accepted processes for data analysis and allow researchers to begin constructing new understandings and forming interpretations of the studied phenomena. We find the qualitative research literature to be particularly strong in offering support and guidance for researchers engaged in these analytic practices. In addition to those already noted in this chapter, we have found that the following resources provide practical, yet theoretically grounded, approaches to qualitative data analysis. For more detailed, procedural, or atomistic approaches to data analysis, we direct researchers to Miles and Huberman’s classic 1994 text, Qualitative Data Analysis, and Ryan and Bernard’s (2000) chapter on “Data Management and Analysis Methods.” For analysis and interpretation strategies falling somewhere between the atomistic and holistic poles, we suggest Hesse-Biber and Leavy’s (2011) chapter, “Analysis and Interpretation of Qualitative Data,” in their book, The Practice of Qualitative Research (2nd edition); Lichtman’s chapter, “Making Meaning From Your Data,” in her book Qualitative Research in Education: A User’s Guide; and “Processing Fieldnotes: Coding and Memoing,” a chapter in Emerson, Fretz, and Shaw’s (1995) book, Writing Ethnographic Fieldnotes. Each of these sources succinctly describes the processes of data preparation, data reduction, coding and categorizing data, and writing memos about emergent ideas and findings. For more holistic approaches, we have found Denzin and Lincoln’s (2007) Collecting and Interpreting Qualitative Materials and Ellis and Bochner’s (2000) chapter, “Autoethnography, Personal Narrative, Reflexivity,” to both be very informative.

We have not yet mentioned the use of computer software for data analysis. The use of CAQDAS (Computer Assisted Qualitative Data Analysis Software) has become prevalent, and we have found such software particularly useful when working with large data sets. A detailed treatment, however, is beyond the scope of this chapter: the software can support analysis, but only human researchers can interpret in the ways we describe. Multiple sources are readily available for those interested in exploring computer-assisted analysis.

Even after reviewing the multiple resources for treating data included here, qualitative researchers might still be wondering, “but exactly how do we interpret?” In the remainder of this section, and in the concluding section of this chapter, we more concretely provide responses to this question, and, in closing, propose a framework for researchers to utilize as they engage in the complex, ambiguous, and yet exciting process of constructing meanings and new understandings from qualitative sources.

These meanings and understandings are often presented as theory, but theories in this sense should be viewed more as “guides to perception” than as “devices that lead to the tight control or precise prediction of events” (Eisner, 1991, p. 95). Perhaps Erickson’s (1986) concept of “assertions” is a more appropriate aim for qualitative researchers. He claimed that assertions are declarative statements; they include a summary of the new understandings, and they are supported by evidence/data. These assertions are open to revision and are revised when disconfirming evidence requires modification. Assertions, theories, or other explanations resulting from interpretation in research are typically presented as “findings” in written research reports. Belgrave and Smith (2002) emphasize the importance of these interpretations (as opposed to descriptions): “the core of the report is not the events reported by the respondent, but rather the subjective meaning of the reported events for the respondent” (p. 248).

Mills (2007) views interpretation as responding to the question, “So what?” He provides researchers a series of concrete strategies for both analysis and interpretation. Specific to interpretation, Mills suggests a variety of techniques, including the following:

“ Extend the Analysis ”: In doing so, researchers ask additional questions about the research. The data appear to say X, but could it be otherwise? In what ways do the data support emergent finding X? And, in what ways do they not?

“ Connect Findings with Personal Experience ”: Using this technique, researchers share interpretations based on their intimate knowledge of the context, the observed actions of the individuals in the studied context, and the data points that support emerging interpretations, as well as their awareness of discrepant events or outlier data. In a sense, the researcher is saying, “based on my experiences in conducting this study, this is what I make of it all.”

“ Seek the Advice of ‘Critical’ Friends ”: In doing so, researchers utilize trusted colleagues, fellow researchers, experts in the field of study, and others to offer insights, alternative interpretations, and the application of their own unique lenses to a researcher’s initial findings. We especially like this strategy because we acknowledge that, too often, qualitative interpretation is a “solo” affair.

“ Contextualize the Findings in the Literature ”: This allows researchers to compare their interpretations to others writing about and studying the same/similar phenomena. The results of this contextualization may be that the current study’s findings correspond with the findings of other researchers. The results might, alternatively, differ from the findings of other researchers. In either instance, the researcher can highlight his or her unique contributions to our understanding of the topic under study.

“ Turn to Theory” : Mills defines theory as “an analytical and interpretive framework that helps the researcher make sense of ‘what is going on’ in the social setting being studied.” In turning to theory, researchers search for increasing levels of abstraction and move beyond purely descriptive accounts. Connecting to extant or generating new theory enables researchers to link their work to the broader contemporary issues in the field. (p. 136)

Other theorists offer additional advice for researchers engaged in the act of interpretation. Richardson (1995) reminds us to account for the power dynamics in the researcher–researched relationship and notes that, in doing so, we can allow for oppressed and marginalized voices to be heard in context. Bogdan and Biklen (2003) suggest that researchers engaged in interpretation revisit foundational writing about qualitative research, read studies related to the current research, ask evaluative questions (e.g., is what I’m seeing here good or bad?), ask about implications of particular findings/interpretations, think about the audience for interpretations, look for stories and incidents that illustrate a specific finding/interpretation, and attempt to summarize key interpretations in a succinct paragraph. All of these suggestions can be pertinent in certain situations and with particular methodological approaches. In the next and closing section of this chapter, we present a framework for interpretive strategies we believe will support, guide, and be applicable to qualitative researchers across multiple methodologies and paradigms.

In What Ways Can a Framework for Interpretation Strategies Support Qualitative Researchers Across Multiple Methodological and Paradigmatic Views?

The process of qualitative research is often compared to a journey, one without a detailed itinerary and ending, but instead a journey with general direction and aims and yet an open-endedness that adds excitement and thrives on curiosity. Qualitative researchers are travelers. They travel physically to field sites; they travel mentally through various epistemological, theoretical, and methodological grounds; they travel through a series of problem finding, access, data collection, and data analysis processes; and, finally—the topic of this chapter—they travel through the process of making meaning out of all this physical and cognitive travel via interpretation.

Although travel is an appropriate metaphor to describe the journey of qualitative researchers, we’ll also use “travel” to symbolize a framework for qualitative research interpretation strategies. By design, this is a framework that applies across multiple paradigmatic, epistemological, and methodological traditions. The application of this framework is not formulaic or highly prescriptive, but neither is it an “anything goes” approach. It falls, and is applicable, between these poles, giving concrete (suggested) direction to qualitative researchers wanting to make the most of the interpretations that result from their research, yet allowing the necessary flexibility for researchers to employ the methods, theories, and approaches they deem most appropriate to the research problem(s) under study.

TRAVEL, a Comprehensive Approach to Qualitative Interpretation

In using the word “TRAVEL” as a mnemonic device, our aim is to highlight six essential concepts we argue all qualitative researchers should attend to in the interpretive process: Transparency, Reflexivity, Analysis, Validity, Evidence, and Literature. The importance of each is addressed here.

Transparency, as a research concept, seems, well... transparent. But, too often, we read qualitative research reports and are left with many questions: How were research participants and the topic of study selected/excluded? How were the data collected, when, and for how long? Who analyzed and interpreted these data? A single researcher? Multiple? What interpretive strategies were employed? Are there data points that substantiate these interpretations/findings? What analytic procedures were used to organize the data prior to making the presented interpretations? In being transparent about data collection, analysis, and interpretation processes, researchers allow reviewers/readers insight into the research endeavor, and this transparency lends credibility to both the researcher and the researcher’s claims. Altheide and Johnson (2011) explain, “There is great diversity of qualitative research.... While these approaches differ, they also share an ethical obligation to make public their claims, to show the reader, audience, or consumer why they should be trusted as faithful accounts of some phenomenon” (p. 584). This includes, they note, articulating “what the different sources of data were, how they were interwoven, and... how subsequent interpretations and conclusions are more or less closely tied to the various data... the main concern is that the connection be apparent, and to the extent possible, transparent” (p. 590).

In the “Dreams as Data” art and research project noted earlier, transparency was addressed in multiple ways. Readers of the project write-up were informed that interpretations resulting from the study, framed as “themes,” were a result of collaborative analysis that included insights from both students and instructor. Viewers of the art installation/data display had the rare opportunity to see all participant responses. In other words, viewers had access to the entire raw dataset (see Trent, 2002). More frequently, we encounter only research “findings” already distilled, analyzed, and interpreted in research accounts, often by a single researcher. Allowing research consumers access to the data to interpret for themselves in the “dreams” project was an intentional attempt at transparency.

Reflexivity, the second of our concepts for interpretive researcher consideration, has garnered a great deal of attention in qualitative research literature. Some have called this increased attention the “reflexive turn” (see, e.g., Denzin & Lincoln, 2004):

Although you can find many meanings for the term reflexivity, it is usually associated with a critical reflection on the practice and process of research and the role of the researcher. It concerns itself with the impact of the researcher on the system and the system on the researcher. It acknowledges the mutual relationships between the researcher and who and what is studied... by acknowledging the role of the self in qualitative research, the researcher is able to sort through biases and think about how they affect various aspects of the research, especially interpretation of meanings. (Lichtman, 2006, pp. 206–207)

As with transparency, attending to reflexivity allows researchers to attach credibility to presented findings. Providing a reflexive account of researcher subjectivity and the interactions of this subjectivity within the research process is a way for researchers to communicate openly with their audience. Instead of trying to excise inherent bias from the process, qualitative researchers share with readers the value of having a specific, idiosyncratic positionality. As a result, situated, contextualized interpretations are viewed as an asset rather than a liability.

LaBanca (2011), acknowledging the often solitary nature of qualitative research, calls for researchers to engage others in the reflexive process. Like many other researchers, LaBanca utilizes a researcher journal to chronicle reflexive thoughts, explorations, and understandings, but he takes this a step further. Realizing the value of others’ input, LaBanca posts his reflexive journal entries on a blog (what he calls an “online reflexivity blog”) and invites critical friends, other researchers, and interested members of the community to audit his reflexive moves, providing insights, questions, and critique that inform his research and study interpretations.

We agree this is a novel approach worth considering. We, too, understand that multiple interpreters will undoubtedly produce multiple interpretations, a richness of qualitative research. So, we suggest researchers consider bringing others in before the production of the report. This could be fruitful in multiple stages of the inquiry process, but especially so in the complex, idiosyncratic processes of reflexivity and interpretation. We are both educators and educational researchers. Historically, each of these roles has tended to be constructed as an isolated endeavor, the solitary teacher, the solo researcher/fieldworker. As noted earlier and in the “analysis” section that follows, introducing collaborative processes to what has often been a solitary activity offers much promise for generating rich interpretations that benefit from multiple perspectives.

Being consciously reflexive throughout our practice as researchers has benefitted us in many ways. In a study of teacher education curricula designed to prepare preservice teachers to support second-language learners, we realized hard truths that caused us to reflect on and adapt our own practices as teacher educators. Reflexivity can inform a researcher at all stages of the inquiry, even early on. For example, one of us was beginning a study of instructional practices in an elementary school. The communicated methods of the study indicated that the researcher would be largely an observer. Early fieldwork revealed that the researcher became much more involved as a participant than anticipated. Deep reflection and writing about the classroom interactions allowed the researcher to realize that the initial purpose of the research was not being accomplished, and the researcher believed he was having a negative impact on the classroom culture. Reflexivity in this instance prompted the researcher to leave the field and abandon the project as it was just beginning. Researchers should plan to openly engage in reflexive activities, including writing about their ongoing reflections and subjectivities. Including excerpts of this writing in the research account supports our earlier recommendation of transparency.

Early in this chapter, for the purposes of discussion and examination, we defined analysis as “summarizing and organizing” data in a qualitative study, and interpretation as “finding” or “making” meaning. Although our focus has been on interpretation as the primary topic here, the importance of good analysis cannot be overstated for, without it, resultant interpretations are likely incomplete and potentially uninformed. Comprehensive analysis puts researchers in a position to be deeply familiar with collected data and to organize these data into forms that lead to rich, unique interpretations, and yet to interpretations clearly connected to data exemplars. Although we find it advantageous to examine analysis and interpretation as different but related practices, in reality, the lines blur as qualitative researchers engage in these recursive processes.

We earlier noted our affinity for a variety of approaches to analysis (see, e.g., Lichtman, 2006; Saldaña, 2011; or Hesse-Biber & Leavy, 2011). Emerson, Fretz, and Shaw (1995) present a grounded approach to qualitative data analysis: in early stages, researchers engage in a close, line-by-line reading of data/collected text and accompany this reading with open coding, a process of categorizing and labeling the inquiry data. Next, researchers write initial memos to describe and organize the data under analysis. These analytic phases allow the researcher(s) to prepare, organize, summarize, and understand the data, in preparation for the more interpretive processes of focused coding and the writing up of interpretations and themes in the form of integrative memos.
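Researchers who keep coded excerpts in software or scripts rather than on index cards sometimes find it helpful to see these analytic moves as simple data structures. The Python sketch below is purely illustrative (the extra code labels “respect” and “learning” are our hypothetical additions, not codes from the studies above); it shows open-coded excerpts being grouped under a broader focused-coding category, the organizing work that precedes interpretation:

```python
from collections import defaultdict

# Open coding: each excerpt of transcript text carries one or more
# researcher-assigned code labels (some labels here are hypothetical).
excerpts = [
    ("she knew our names... it made me feel very happy", ["knowing names"]),
    ("they should take the time to learn ours too", ["knowing names", "respect"]),
    ("if you get to know each other... learning seems easier", ["relationships", "learning"]),
]

# Index excerpts by code so related passages can be read side by side.
by_code = defaultdict(list)
for text, codes in excerpts:
    for code in codes:
        by_code[code].append(text)

# Focused coding: collapse related open codes into a broader category,
# e.g. the "relationships" category discussed in the memo above.
categories = {"relationships": ["knowing names", "respect", "relationships"]}

for category, code_list in categories.items():
    print(f"Category: {category}")
    for code in code_list:
        for text in by_code.get(code, []):
            print(f"  [{code}] {text}")
```

Nothing in this sketch does interpretive work, of course; it merely keeps the data organized so that the researcher can.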

Similarly, Mills (2007) provides guidance on the process of analysis for qualitative action researchers. His suggestions for organizing and summarizing data include coding (labeling data and looking for patterns), asking key questions about the study data (who, what, where, when, why, and how), developing concept maps (graphic organizers that show initial organization and relationships in the data), and stating what’s missing by articulating what data are not present (pp. 124–132).

Many theorists, like Emerson, Fretz, and Shaw (1995) and Mills (2007) noted here, provide guidance for individual researchers engaged in individual data collection, analysis, and interpretation; others, however, invite us to consider the benefits of collaboratively engaging in these processes through the use of collaborative research and analysis teams. Paulus, Woodside, and Ziegler (2008) wrote about their experiences in collaborative qualitative research: “Collaborative research often refers to collaboration among the researcher and the participants. Few studies investigate the collaborative process among researchers themselves” (p. 226).

Paulus, Woodside, and Ziegler (2008) claim that the collaborative process “challenged and transformed our assumptions about qualitative research” (p. 226). Engaging in reflexivity, analysis, and interpretation as a collaborative enabled these researchers to reframe their views about the research process, finding that the process was much more recursive, as opposed to following a linear progression. They also found that cooperatively analyzing and interpreting data yielded “collaboratively constructed meanings” as opposed to “individual discoveries.” And finally, instead of the traditional “individual products” resulting from solo research, collaborative interpretation allowed researchers to participate in an “ongoing conversation” (p. 226).

These researchers explain that engaging in collaborative analysis and interpretation of qualitative data challenged their previously held assumptions. They note, “through collaboration, procedures are likely to be transparent to the group and can, therefore, be made public. Data analysis benefits from an iterative, dialogic, and collaborative process because thinking is made explicit in a way that is difficult to replicate as a single researcher” (Paulus, Woodside, & Ziegler, 2008, p. 236). They share that during the collaborative process, “we constantly checked our interpretation against the text, the context, prior interpretations, and each other’s interpretations” (p. 234).

We, too, have engaged in analysis similar to these described processes, including working on research teams. We encourage other researchers to find processes that fit the methodology and data of a particular study, use the techniques and strategies most appropriate, and then cite the relevant authorities to justify the selected path. We urge traditionally solo researchers to consider trying a collaborative approach. Generally, we suggest researchers be familiar with a wide repertoire of practices. In doing so, they’ll be in a better position to select and use the strategies most appropriate for their studies and data. Carefully preparing, organizing, categorizing, and summarizing data sets researchers up to construct meaningful interpretations in the form of assertions, findings, themes, and theories.

Researchers want their findings to be sound, backed by evidence, justifiable, and accurately representative of the phenomena under study. In short, researchers seek validity for their work. We assert that qualitative researchers should attend to validity concepts as a part of their interpretive practices. We have previously written and theorized about validity, and, in doing so, we have highlighted and labeled what we consider to be two distinctly different approaches, transactional and transformational (Cho & Trent, 2006). We define transactional validity in qualitative research as an interactive process occurring among the researcher, the researched, and the collected data, one that is aimed at achieving a relatively higher level of accuracy. Techniques, methods, and/or strategies are employed during the conduct of the inquiry. These techniques, such as member checking and triangulation, are seen as a medium with which to ensure an accurate reflection of reality (or, at least, participants’ constructions of reality). Lincoln and Guba’s (1985) widely known notion of trustworthiness in “naturalistic inquiry” is grounded in this approach. In seeking trustworthiness, researchers attend to research credibility, transferability, dependability, and confirmability. Validity approaches described by Maxwell (1992) as “descriptive” and “interpretive” also rely on transactional processes.

For example, in the write-up of a study on the facilitation of teacher research, one of us (Trent, 2012, p. 44) wrote about the use of transactional processes: “‘Member checking is asking the members of the population being studied for their reaction to the findings’ (Sagor, 2000, p. 136). Interpretations and findings of this research, in draft form, were shared with teachers (for member checking) on multiple occasions throughout the study. Additionally, teachers reviewed and provided feedback on the final draft of this article.” This member checking led to changes in some resultant interpretations (called findings in this particular study) and to adaptations of others that shaped these findings in ways that made them both richer and more contextualized.

Alternatively, in transformational approaches, validity is not so much something that can be achieved solely by way of certain techniques. Transformationalists assert that because traditional or positivist inquiry is no longer seen as an absolute means to truth in the realm of human science, alternative notions of validity should be considered to achieve social justice, deeper understandings, broader visions, and other legitimate aims of qualitative research. In this sense, it is the ameliorative aspects of the research that achieve (or don’t achieve) its validity. Validity is determined by the resultant actions prompted by the research endeavor.

Lather (1993), Richardson (1997), and others (e.g., Lenzo, 1995; Scheurich, 1996) propose a transgressive approach to validity that emphasizes a higher degree of self-reflexivity. For example, Lather has proposed a “catalytic validity” described as “the degree to which the research empowers and emancipates the research subjects” (Scheurich, 1996, p. 4). Beverley (2000, p. 556) has proposed “testimonio” as a qualitative research strategy. These first-person narratives find their validity in their ability to raise consciousness and thus provoke political action to remedy problems of oppressed peoples (e.g., poverty, marginality, exploitation).

We, too, have pursued research with transformational aims. In the earlier mentioned study of preservice teachers’ experiences learning to teach second-language learners (Cho, Rios, Trent, & Mayfield, 2012), our aims were to empower faculty members, evolve the curriculum, and, ultimately, better serve preservice teachers so that they might better serve English-language learners in their classrooms. As program curricula and activities have changed as a result, we claim a degree of transformational validity for this research.

Important, then, for qualitative researchers throughout the inquiry, but especially when engaged in the process of interpretation, is to determine the type(s) of validity applicable to the study. What are the aims of the study? Providing an “accurate” account of studied phenomena? Empowering participants to take action for themselves and others? The determination of this purpose will, in turn, inform researchers’ analysis and interpretation of data. Understanding and attending to the appropriate validity criteria will bolster researcher claims to meaningful findings and assertions.

Regardless of purpose or chosen validity considerations, qualitative research depends on evidence. Researchers in different qualitative methodologies rely on different types of evidence to support their claims. Qualitative researchers typically utilize a variety of forms of evidence including texts (written notes, transcripts, images, etc.), audio and video recordings, cultural artifacts, documents related to the inquiry, journal entries, and field notes taken during observations of social contexts and interactions. “Evidence is essential to justification, and justification takes the form of an argument about the merit(s) of a given claim. It is generally accepted that no evidence is conclusive or unassailable (and hence, no argument is foolproof). Thus, evidence must often be judged for its credibility, and that typically means examining its source and the procedures by which it was produced [thus the need for transparency discussed earlier]” (Schwandt, 2001, p. 82).

Qualitative researchers distinguish evidence from facts. Evidence and facts are similar but not identical. We can often agree on facts, e.g., there is a rock, it is harder than cotton candy. Evidence involves an assertion that some facts are relevant to an argument or claim about a relationship. Since a position in an argument is likely tied to an ideological or even epistemological position, evidence is not completely bound by facts, but it is more problematic and subject to disagreement. (Altheide & Johnson, 2011, p. 586)

Inquirers should make every attempt to link evidence to claims (or findings, interpretations, assertions, conclusions, etc.). There are many strategies for making these connections. Induction involves accumulating multiple data points to infer a general conclusion. Confirmation entails directly linking evidence to resultant interpretations. Testability/falsifiability means illustrating that evidence does not necessarily contradict the claim/interpretation, and so increases the credibility of the claim (Schwandt, 2001). In the “learning to teach second-language learners” study, for example, a study finding (Cho, Rios, Trent, & Mayfield, 2012, p. 77) was that “as a moral claim, candidates increasingly [in higher levels of the teacher education program] feel more responsible and committed to ELLs [English language learners].” We supported this finding with a series of data points that included the following preservice teacher response: “It is as much the responsibility of the teacher to help teach second-language learners the English language as it is our responsibility to teach traditional English speakers to read or correctly perform math functions.” Claims supported by evidence allow readers to see for themselves, to examine researcher assertions in tandem with evidence, and to form further interpretations of their own.

Some postmodernists reject the notion that qualitative interpretations are arguments based on evidence. Instead, they argue that qualitative accounts are not intended to faithfully represent experience, but are designed to evoke feelings or reactions in the reader of the account (Schwandt, 2001). We argue that, even in these instances where transformational validity concerns take priority over transactional processes, evidence still matters. Did the assertions accomplish the evocative aims? What evidence/arguments were used to evoke these reactions? Does the presented claim correspond with the study’s evidence? Is the account inclusive? In other words, does it attend to all evidence or selectively compartmentalize some data while capitalizing on other evidentiary forms?

Researchers, we argue, should be both transparent and reflexive about these questions and, regardless of research methodology or purpose, should share with readers of the account their evidentiary moves and aims. Altheide and Johnson (2011) call this an “evidentiary narrative” and explain:

Ultimately, evidence is bound up with our identity in a situation.... An “evidentiary narrative” emerges from a reconsideration of how knowledge and belief systems in everyday life are tied to epistemic communities that provide perspectives, scenarios, and scripts that reflect symbolic and social moral orders. An “evidentiary narrative” symbolically joins an actor, an audience, a point of view (definition of a situation), assumptions, and a claim about a relationship between two or more phenomena. If any of these factors are not part of the context of meaning for a claim, it will not be honored, and thus, not seen as evidence. (p. 686)

In sum, readers/consumers of a research account deserve to know how evidence was treated and viewed in an inquiry. They want and should be aware of accounts that aim to evoke versus represent, and then they can apply their own criteria (including the potential transferability to their situated context). Renowned ethnographer and qualitative research theorist Harry Wolcott (1990) urges researchers to “let readers ‘see’ for themselves” by providing more detail rather than less and by sharing primary data/evidence to support interpretations. In the end, readers don’t expect perfection. Writer Eric Liu (2010) explains, “we don’t expect flawless interpretation. We expect good faith. We demand honesty.”

Last, in this journey through concepts we assert are pertinent to researchers engaged in interpretive processes, we include attention to the “ literature .” In discussing “literature,” qualitative researchers typically mean publications about the prior research conducted on topics aligned with or related to a study. Most often, this research/literature is reviewed and compiled by researchers in a section of the research report titled, “literature review.” It is here we find others’ studies, methods, and theories related to our topics of study, and it is here we hope the assertions and theories that result from our studies will someday reside.

We acknowledge the value of being familiar with research related to topics of study. This familiarity can inform multiple phases of the inquiry process. Understanding the extant knowledge base can inform research questions and topic selection, data collection and analysis plans, and the interpretive process. In what ways do the interpretations from this study correspond with other research conducted on this topic? Do findings/interpretations corroborate, expand, or contradict other researchers’ interpretations of similar phenomena? In any of these scenarios (correspondence, expansion, contradiction), new findings and interpretations from a study add to and deepen the knowledge base, or literature, on a topic of investigation.

For example, in our literature review for the study of student teaching, we quickly determined that the knowledge base and extant theories related to the student teaching experience were immense, but we also quickly realized that few, if any, studies had examined student teaching from the perspective of the K–12 students who had the student teachers. This focus on the literature related to our topic of student teaching prompted us to embark on a study that would fill a gap in this literature: most of the knowledge base focused on the experiences and learning of the student teachers themselves. Our study, then, by focusing on the K–12 students’ perspectives, added literature/theories/assertions to a previously untapped area. The “literature” in this area (at least we’d like to think) is now more robust as a result.

In another example, a research team (Trent et al., 2003) studying institutional diversity efforts mined the literature, found an appropriate existing (a priori) set of theories/assertions, and then used this framework to analyze its data, in this case a variety of institutional activities related to diversity.

Conducting a literature review to explore extant theories on a topic of study can serve a variety of purposes. As evidenced in these examples, consulting the literature/extant theory can reveal gaps in the literature. A literature review might also lead researchers to existing theoretical frameworks that support analysis and interpretation of their data (as in the use of the a priori framework example). Finally, a review of current theories related to a topic of inquiry might confirm that much theory already exists, but that further study may add to, bolster, and/or elaborate on the current knowledge base.

Guidance for researchers conducting literature reviews is plentiful. Lichtman (2006) suggests researchers conduct a brief literature review, begin research, and then update and modify the literature review as the inquiry unfolds. She suggests reviewing a wide range of related materials (not just scholarly journals) and recommends that researchers also attend to literature on methodology, not just the topic of study. She also encourages researchers to bracket and write down thoughts on the research topic as they review the literature, and, importantly for this chapter, she suggests researchers “integrate your literature review throughout your writing rather than using a traditional approach of placing it in a separate chapter [or section]” (p. 105).

We agree that the power of a literature review to provide context for a study can be maximized when this information isn’t compartmentalized apart from a study’s findings. Integrating (or at least revisiting) reviewed literature juxtaposed alongside findings can illustrate how new interpretations add to an evolving story. Eisenhart (1998) expands the traditional conception of the literature review and discusses the concept of an “interpretive review.” By taking this interpretive approach, Eisenhart claims that reviews, alongside related interpretations/findings on a specific topic, have the potential to allow readers to see the studied phenomena in entirely new ways, through new lenses, revealing heretofore unconsidered perspectives. Reviews that offer surprising and enriching perspectives on meanings and circumstances “shake things up, break down boundaries, and cause things (or thinking) to expand” (p. 394). Coupling reviews of this sort with current interpretations will “give us stories that startle us with what we have failed to notice” (p. 395).

In reviews of research studies, it can certainly be important to evaluate the findings in light of established theories and methods [the sorts of things typically included in literature reviews]. However, it also seems important to ask how well the studies disrupt conventional assumptions and help us to reconfigure new, more inclusive, and more promising perspectives on human views and actions. From an interpretivist perspective, it would be most important to review how well methods and findings permit readers to grasp the sense of unfamiliar perspectives and actions. (Eisenhart, 1998, p. 397)

And so, our journey through qualitative research interpretation and the selected concepts we’ve treated in this chapter nears an end, an end in the written text, but a hopeful beginning of multiple new conversations among ourselves and in concert with other qualitative researchers. Our aims here have been to circumscribe interpretation in qualitative research; emphasize the importance of interpretation in achieving the aims of the qualitative project; discuss the interactions of methodology, data, and the researcher/self as these concepts and theories intertwine with interpretive processes; describe some concrete ways that qualitative inquirers engage the process of interpretation; and, finally, to provide a framework of interpretive strategies that may serve as a guide for ourselves and other researchers.

In closing, we note that this “travel” framework, construed as a journey to be undertaken by researchers engaged in the interpretive process, is not designed to be rigid or prescriptive, but instead is designed to be a flexible set of concepts that will inform researchers across multiple epistemological, methodological, and theoretical paradigms. We chose the concepts of transparency, reflexivity, analysis, validity, evidence, and literature (TRAVEL) because they are applicable to the infinite journeys undertaken by qualitative researchers who have come before and to those who will come after us. As we journeyed through our interpretations of interpretation, we have discovered new things about ourselves and our work. We hope readers also garner insights that enrich their interpretive excursions. Happy travels to all—Bon Voyage!

Altheide, D., & Johnson, J. M. (2011). Reflections on interpretive adequacy in qualitative research. In N. K. Denzin & Y. S. Lincoln (Eds.), The Sage handbook of qualitative research (pp. 595–610). Thousand Oaks, CA: Sage.

Barrett, J. (2007). The researcher as instrument: Learning to conduct qualitative research through analyzing and interpreting a choral rehearsal. Music Education Research, 9(3), 417–433.

Barrett, T. (2000). Criticizing art: Understanding the contemporary. New York: McGraw Hill.

Belgrave, L. L., & Smith, K. J. (2002). Negotiated validity in collaborative ethnography. In N. K. Denzin & Y. S. Lincoln (Eds.), The qualitative inquiry reader (pp. 233–255). Thousand Oaks, CA: Sage.

Beverley, J. (2000). Testimonio, subalternity, and narrative authority. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 555–566). Thousand Oaks, CA: Sage.

Bogdan, R. C., & Biklen, S. K. (2003). Qualitative research for education: An introduction to theories and methods. Boston: Allyn and Bacon.

Cho, J., Rios, F., Trent, A., & Mayfield, K. (2012). Integrating language diversity into teacher education curricula in a rural context: Candidates’ developmental perspectives and understandings. Teacher Education Quarterly, 39(2), 63–85.

Cho, J., & Trent, A. (2006). Validity in qualitative research revisited. Qualitative Research, 6(3), 319–340.

Denzin, N. K., & Lincoln, Y. S. (Eds.). (2004). Handbook of qualitative research. Newbury Park, CA: Sage.

Denzin, N. K., & Lincoln, Y. S. (2007). Collecting and interpreting qualitative materials. Thousand Oaks, CA: Sage.

Eisenhart, M. (1998). On the subject of interpretive reviews. Review of Educational Research, 68(4), 391–399.

Eisner, E. (1991). The enlightened eye: Qualitative inquiry and the enhancement of educational practice. New York: Macmillan.

Ellingson, L. L. (2011). Analysis and representation across the continuum. In N. K. Denzin & Y. S. Lincoln (Eds.), The Sage handbook of qualitative research (pp. 595–610). Thousand Oaks, CA: Sage.

Ellis, C., & Bochner, A. P. (2000). Autoethnography, personal narrative, reflexivity: Researcher as subject. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 733–768). Thousand Oaks, CA: Sage.

Emerson, R., Fretz, R., & Shaw, L. (1995). Writing ethnographic fieldnotes. Chicago: University of Chicago Press.

Erickson, F. (1986). Qualitative methods in research on teaching. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 119–161). New York: Macmillan.

Glaser, B. (1965). The constant comparative method of qualitative analysis. Social Problems, 12(4), 436–445.

Gubrium, J. F., & Holstein, J. A. (2000). Analyzing interpretive practice. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 487–508). Thousand Oaks, CA: Sage.

Hesse-Biber, S. N., & Leavy, P. (2011). The practice of qualitative research (2nd ed.). Thousand Oaks, CA: Sage.

Hubbard, R. S., & Power, B. M. (2003). The art of classroom inquiry: A handbook for teacher researchers. Portsmouth, NH: Heinemann.

Husserl, E. (1962). Ideas: General introduction to pure phenomenology (W. R. Boyce Gibson, Trans.). London, New York: Collier, Macmillan. (Original work published 1913)

LaBanca, F. (2011). Online dynamic asynchronous audit strategy for reflexivity in the qualitative paradigm. Qualitative Report, 16(4), 1160–1171.

Lather, P. (1993). Fertile obsession: Validity after poststructuralism. Sociological Quarterly, 34(4), 673–693.

Lenzo, K. (1995). Validity and self-reflexivity meet poststructuralism: Scientific ethos and the transgressive self. Educational Researcher, 24(4), 17–23, 45.

Lichtman, M. (2006). Qualitative research in education: A user’s guide. Thousand Oaks, CA: Sage.

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage.

Liu, E. (2010). The real meaning of balls and strikes. Retrieved September 20, 2012, from http://www.huffingtonpost.com/eric-liu/the-real-meaning-of-balls_b_660915.html

Maxwell, J. (1992). Understanding and validity in qualitative research. Harvard Educational Review, 62(3), 279–300.

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis. Thousand Oaks, CA: Sage.

Mills, G. E. (2007). Action research: A guide for the teacher researcher. Upper Saddle River, NJ: Pearson.

Paulus, T., Woodside, M., & Ziegler, M. (2008). Extending the conversation: Qualitative research as dialogic collaborative process. Qualitative Report, 13(2), 226–243.

Richardson, L. (1995). Writing stories: Co-authoring “the sea monster,” a writing story. Qualitative Inquiry, 1, 189–203.

Richardson, L. (1997). Fields of play: Constructing an academic life. New Brunswick, NJ: Rutgers University Press.

Ryan, G. W., & Bernard, H. R. (2000). Data management and analysis methods. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 769–802). Thousand Oaks, CA: Sage.

Sagor, R. (2000). Guiding school improvement with action research. Alexandria, VA: ASCD.

Saldaña, J. (2011). Fundamentals of qualitative research. New York: Oxford University Press.

Scheurich, J. (1996). The masks of validity: A deconstructive investigation. Qualitative Studies in Education, 9(1), 49–60.

Schwandt, T. A. (2001). Dictionary of qualitative inquiry. Thousand Oaks, CA: Sage.

Slotnick, R. C., & Janesick, V. J. (2011). Conversations on method: Deconstructing policy through the researcher reflective journal. Qualitative Report, 16(5), 1352–1360.

Trent, A. (2002). Dreams as data: Art installation as heady research. Teacher Education Quarterly, 29(4), 39–51.

Trent, A. (2012). Action research on action research: A facilitator’s account. Action Learning and Action Research Journal, 18(1), 35–67.

Trent, A., Rios, F., Antell, J., Berube, W., Bialostok, S., Cardona, D., et al. (2003). Problems and possibilities in the pursuit of diversity: An institutional analysis. Equity & Excellence, 36(3), 213–224.

Trent, A., & Zorko, L. (2006). Listening to students: “New” perspectives on student teaching. Teacher Education & Practice, 19(1), 55–70.

Willis, J. W. (2007). Foundations of qualitative research: Interpretive and critical approaches. Thousand Oaks, CA: Sage.

Wolcott, H. (1990). On seeking-and rejecting-validity in qualitative research. In E. Eisner & A. Peshkin (Eds.), Qualitative inquiry in education: The continuing debate (pp. 121–152). New York: Teachers College Press.


Leeds Beckett University

Skills for Learning: Research Skills

Data analysis is an ongoing process that should occur throughout your research project. Suitable data-analysis methods must be selected when you write your research proposal. The nature of your data (i.e. quantitative or qualitative) will be influenced by your research design and purpose. The data will also influence the analysis methods selected.

We run interactive workshops to help you develop skills related to doing research, such as data analysis, writing literature reviews and preparing for dissertations. Find out more on the Skills for Learning Workshops page.

We have online academic skills modules within MyBeckett for all levels of university study. These modules will help your academic development and support your success at LBU. You can work through the modules at your own pace, revisiting them as required. Find out more from our FAQ What academic skills modules are available?

Quantitative data analysis

Broadly speaking, 'statistics' refers to methods, tools and techniques used to collect, organise and interpret data. The goal of statistics is to gain understanding from data. Therefore, you need to know how to:

  • Produce data – for example, by handing out a questionnaire or doing an experiment.
  • Organise, summarise, present and analyse data.
  • Draw valid conclusions from findings.

There are a number of statistical methods you can use to analyse data. Choosing an appropriate statistical method should, however, follow naturally from your research design. Therefore, you should think about data analysis at the early stages of your study design. You may need to consult a statistician for help with this.

Tips for working with statistical data

  • Plan so that the data you get has a good chance of successfully tackling the research problem. This will involve reading literature on your subject, as well as on what makes a good study.
  • To reach useful conclusions, you need to reduce uncertainties or 'noise'. Thus, you will need a sufficiently large data sample. A large sample will improve precision. However, this must be balanced against the 'costs' (time and money) of collection.
  • Consider the logistics. Will there be problems in obtaining sufficient high-quality data? Think about accuracy, trustworthiness and completeness.
  • Many statistical techniques assume random sampling. Consider whether your sample will be suited to this sort of analysis. Might there be biases to think about?
  • How will you deal with missing values (any data that is not recorded for some reason)? These can result from gaps in a record or whole records being missed out.
  • When analysing data, start by looking at each variable separately. Conduct initial/exploratory data analysis using graphical displays. Do this before looking at variables in conjunction or anything more complicated. This process can help locate errors in the data and also gives you a 'feel' for the data.
  • Look out for patterns of 'missingness'. They are likely to alert you if there’s a problem. If the 'missingness' is not random, then it will have an impact on the results.
  • Be vigilant and think through what you are doing at all times. Think critically. Statistics are not just mathematical tricks that a computer sorts out. Rather, analysing statistical data is a process that the human mind must interpret!

Top tips! Try inventing or generating the sort of data you might get and see if you can analyse it. Make sure that your process works before gathering actual data. Think about what the output of an analytic procedure will look like before running it for real.

(Note: it is actually difficult to generate realistic data. There are fraud-detection methods in place to identify data that has been fabricated. So, remember to get rid of your practice data before analysing the real stuff!)
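As a minimal sketch of this practice-data idea, assuming Python with numpy and pandas available (the variables, group labels, and effect size are all invented for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # seeded so the practice run is reproducible

# Invent data shaped like what the real study might collect
# (hypothetical variables: age, group, score).
n = 200
practice = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "group": rng.choice(["control", "treatment"], size=n),
    "score": rng.normal(loc=70, scale=10, size=n),
})

# Build in a known effect so you can check your analysis detects it.
practice.loc[practice["group"] == "treatment", "score"] += 5

# Simulate missing values, since real data will have them.
practice.loc[rng.choice(n, size=10, replace=False), "score"] = np.nan

# Exploratory look at each variable separately, as suggested above.
print(practice.describe(include="all"))
print(practice.isna().sum())                      # pattern of 'missingness'
print(practice.groupby("group")["score"].mean())  # did we recover the +5 effect?
```

If the analysis recovers the effect you built in (here, roughly +5 for the treatment group) and handles the missing values sensibly, the pipeline is ready for real data; then delete the practice set, as the note above advises.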

Statistical software packages

Software packages can be used to analyse and present data. The most widely used ones are SPSS and NVivo.

SPSS is a statistical-analysis and data-management package for quantitative data analysis. Click on ‘How do I install SPSS?’ to learn how to download SPSS to your personal device. SPSS can perform a wide variety of statistical procedures. Some examples (approximated in script form after this list) are:

  • Data management (e.g. creating subsets of data or transforming data).
  • Summarising, describing or presenting data (e.g. mean, median and frequency).
  • Looking at the distribution of data (e.g. standard deviation).
  • Comparing groups for significant differences using parametric (e.g. t-test) and non-parametric (e.g. Chi-square) tests.
  • Identifying significant relationships between variables (e.g. correlation).
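SPSS itself is menu-driven, but the same everyday procedures can be approximated in a script. The sketch below uses Python with pandas and scipy purely for illustration; the file name survey.csv and the columns group, score, age and passed are hypothetical stand-ins for your own data:

```python
import pandas as pd
from scipy import stats

# Load a tidy dataset (hypothetical file and column names).
df = pd.read_csv("survey.csv")

# Summarising and describing data: mean, median, frequencies.
print(df["score"].mean(), df["score"].median())
print(df["group"].value_counts())

# Distribution of data: standard deviation.
print(df["score"].std())

# Comparing two groups with a parametric test (independent-samples t-test).
a = df.loc[df["group"] == "control", "score"].dropna()
b = df.loc[df["group"] == "treatment", "score"].dropna()
print(stats.ttest_ind(a, b))

# Non-parametric test of association between two categorical variables
# (Chi-square on a contingency table).
print(stats.chi2_contingency(pd.crosstab(df["group"], df["passed"])))

# Relationship between two numeric variables (Pearson correlation).
pair = df[["age", "score"]].dropna()
print(stats.pearsonr(pair["age"], pair["score"]))
```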

NVivo can be used for qualitative data analysis. It is suitable for use with a wide range of methodologies, supporting grounded theory, survey data, case studies, focus groups, phenomenology, field research and action research. Click on ‘How do I access NVivo’ to learn how to download NVivo to your personal device. With NVivo, you can (see the sketch after this list for a scripted approximation of the final point):

  • Process data such as interview transcripts, literature or media extracts, and historical documents.
  • Code data on screen and explore all coding and documents interactively.
  • Rearrange, restructure, extend and edit text, coding and coding relationships.
  • Search imported text for words, phrases or patterns, and automatically code the results.
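NVivo’s features are point-and-click, but the last idea in the list, searching text and automatically coding the matches, can be approximated in a few lines. A rough sketch, assuming a plain-text transcript split into paragraphs; the code labels, keyword patterns, and file name are invented for illustration:

```python
import re

# Hypothetical starter codebook mapping code labels to keyword patterns.
codebook = {
    "relationships": re.compile(r"relationship|know(ing)?\s+(my|our|their)\s+names?", re.I),
    "collaboration": re.compile(r"collaborat\w*|sharing|together", re.I),
}

# Read a transcript and split it into paragraph-sized segments.
with open("interview_transcript.txt", encoding="utf-8") as f:
    segments = [s.strip() for s in f.read().split("\n\n") if s.strip()]

# Auto-code: attach a code to every segment whose text matches its pattern.
coded = {label: [s for s in segments if pattern.search(s)]
         for label, pattern in codebook.items()}

for label, hits in coded.items():
    print(f"{label}: {len(hits)} matching segment(s)")
```

Dedicated CAQDAS tools add stemming, proximity searching, and interactive review on top of this basic idea, so treat the sketch as a toy.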

Qualitative data analysis

Miles and Huberman (1994) point out that there are diverse approaches to qualitative research and analysis. They suggest, however, that it is possible to identify 'a fairly classic set of analytic moves arranged in sequence'. This involves:

  • Affixing codes to a set of field notes drawn from observation or interviews.
  • Noting reflections or other remarks in the margins.
  • Sorting/sifting through these materials to identify: a) similar phrases, relationships between variables, patterns and themes and b) distinct differences between subgroups and common sequences.
  • Isolating these patterns/processes and commonalties/differences. Then, taking them out to the field in the next wave of data collection.
  • Highlighting generalisations and relating them to your original research themes.
  • Taking the generalisations and analysing them in relation to theoretical perspectives.

        (Miles and Huberman, 1994)

Patterns and generalisations are usually arrived at through a process of analytic induction (see points 5 and 6 above). Qualitative analysis rarely involves statistical analysis of relationships between variables; instead, it aims to gain an in-depth understanding of concepts, opinions or experiences.

Presenting information

There are a number of different ways of presenting and communicating information. The particular format you use is dependent upon the type of data generated from the methods you have employed.

Here are some appropriate ways of presenting information for different types of data:

Bar charts: These may be useful for comparing relative sizes. However, they tend to use a large amount of ink to display a relatively small amount of information. Consider a simple line chart as an alternative.

Pie charts: These have the benefit of indicating that the data must add up to 100%. However, they make it difficult for viewers to distinguish relative sizes, especially if two slices have a difference of less than 10%.

Other examples of presenting data in graphical form include line charts and scatter plots.

Qualitative data is more likely to be presented in text form, for example using quotations from interviews or field diaries.

  • Plan ahead, thinking carefully about how you will analyse and present your data.
  • Think through possible restrictions to resources you may encounter and plan accordingly.
  • Find out about the different IT packages available for analysing your data and select the most appropriate.
  • If necessary, allow time to attend an introductory course on a particular computer package. You can book SPSS and NVivo workshops via MyHub.
  • Code your data appropriately, assigning conceptual or numerical codes as suitable.
  • Organise your data so it can be analysed and presented easily.
  • Choose the most suitable way of presenting your information, according to the type of data collected. This will allow your information to be understood and interpreted better.

Primary, secondary and tertiary sources

Information sources are sometimes categorised as primary, secondary or tertiary sources depending on whether or not they are ‘original’ materials or data. For some research projects, you may need to use primary sources as well as secondary or tertiary sources. However, the distinction between primary and secondary sources is not always clear and depends on the context. For example, a newspaper article might usually be categorised as a secondary source, but it could also be regarded as a primary source if it gives a first-hand account of a historical event and was written close to the time it occurred.

  • Primary sources
  • Secondary sources
  • Tertiary sources
  • Grey literature

Primary sources are original sources of information that provide first-hand accounts of what is being experienced or researched. They enable you to get as close to the actual event or research as possible. They are useful for getting the most contemporary information about a topic.

Examples include diary entries, newspaper articles, census data, journal articles with original reports of research, letters, email or other correspondence, original manuscripts and archives, interviews, research data and reports, statistics, autobiographies, exhibitions, films, and artists' writings.

Some information will be available on an Open Access basis, freely accessible online. However, many academic sources are paywalled, and you may need to log in as a Leeds Beckett student to access them. Where Leeds Beckett does not have access to a source, you can use our Request It! Service.

Secondary sources interpret, evaluate or analyse primary sources. They're useful for providing background information on a topic, or for looking back at an event from a current perspective. The majority of your literature searching will probably be done to find secondary sources on your topic.

Examples include journal articles which review or interpret original findings, popular magazine articles commenting on more serious research, textbooks and biographies.

The term tertiary sources isn't used a great deal. There's overlap between what might be considered a secondary source and a tertiary source. One definition is that a tertiary source brings together secondary sources.

Examples include almanacs, fact books, bibliographies, dictionaries and encyclopaedias, directories, indexes and abstracts. They can be useful for introductory information or an overview of a topic in the early stages of research.

Depending on your subject of study, grey literature may be another source you need to use. Grey literature includes technical or research reports, theses and dissertations, conference papers, government documents, white papers, and so on.

Artificial intelligence tools

Before using any generative artificial intelligence or paraphrasing tools in your assessments, you should check if this is permitted on your course.

If their use is permitted on your course, you must acknowledge any use of generative artificial intelligence tools such as ChatGPT or paraphrasing tools (e.g., Grammarly or Quillbot), even if you have only used them to generate ideas for your assessments or for proofreading.

  • Academic Integrity Module in MyBeckett
  • Assignment Calculator
  • Building on Feedback
  • Disability Advice
  • Essay X-ray tool
  • International Students' Academic Introduction
  • Manchester Academic Phrasebank
  • Quote, Unquote
  • Skills and Subject Support
  • Turnitin Grammar Checker


  • Research Methods Checklist
  • Sampling Checklist



Qualitative Data Analysis: Step-by-Step Guide (Manual vs. Automatic)

When we conduct qualitative research, when we need to explain changes in metrics, or when we want to understand people's opinions, we turn to qualitative data. Qualitative data is typically generated through:

  • Interview transcripts
  • Surveys with open-ended questions
  • Contact center transcripts
  • Texts and documents
  • Audio and video recordings
  • Observational notes

Compared to quantitative data, which captures structured information, qualitative data is unstructured and has more depth. It can answer our questions, help formulate hypotheses, and build understanding.

It's important to understand the differences between quantitative data and qualitative data. But unfortunately, analyzing qualitative data is difficult. While tools like Excel, Tableau and PowerBI crunch and visualize quantitative data with ease, mainstream tools for analyzing qualitative data are limited. The majority of qualitative data analysis still happens manually.

That said, there are two new trends that are changing this. First, there are advances in natural language processing (NLP) which is focused on understanding human language. Second, there is an explosion of user-friendly software designed for both researchers and businesses. Both help automate the qualitative data analysis process.

In this post we want to teach you how to conduct a successful qualitative data analysis. There are two primary qualitative data analysis methods: manual and automatic. We’ll guide you through the steps of a manual analysis, look at what’s involved, and show the role technology powered by NLP can play in automating the process.

More businesses are switching to fully-automated analysis of qualitative customer data because it is cheaper, faster, and just as accurate. Primarily, businesses purchase subscriptions to feedback analytics platforms so that they can understand customer pain points and sentiment.


We’ll take you through 5 steps to conduct a successful qualitative data analysis. Within each step, we will highlight the key differences between the manual and automated approaches. Here's an overview of the steps:

The 5 steps to doing qualitative data analysis

  • Gathering and collecting your qualitative data
  • Organizing and connecting your qualitative data
  • Coding your qualitative data
  • Analyzing the qualitative data for insights
  • Reporting on the insights derived from your analysis

What is Qualitative Data Analysis?

Qualitative data analysis is a process of gathering, structuring and interpreting qualitative data to understand what it represents.

Qualitative data is non-numerical and unstructured. Qualitative data generally refers to text, such as open-ended responses to survey questions or user interviews, but also includes audio, photos and video.

Businesses often perform qualitative data analysis on customer feedback. And within this context, qualitative data generally refers to verbatim text data collected from sources such as reviews, complaints, chat messages, support centre interactions, customer interviews, case notes or social media comments.

How is qualitative data analysis different from quantitative data analysis?

Understanding the differences between quantitative & qualitative data is important. When it comes to analyzing data, Qualitative Data Analysis serves a very different role to Quantitative Data Analysis. But what sets them apart?

Qualitative Data Analysis dives into the stories hidden in non-numerical data such as interviews, open-ended survey answers, or notes from observations. It uncovers the ‘whys’ and ‘hows’ giving a deep understanding of people’s experiences and emotions.

Quantitative Data Analysis, on the other hand, deals with numerical data, using statistics to measure differences, identify preferred options, and pinpoint root causes of issues. It steps back to address questions like "how many" or "what percentage" to offer broad insights we can apply to larger groups.

In short, Qualitative Data Analysis is like a microscope, helping us understand specific detail, while Quantitative Data Analysis is like a telescope, giving us a broader perspective. Both are important, working together to decode data for different objectives.

Qualitative Data Analysis methods

Once all the data has been captured, there are a variety of analysis techniques available and the choice is determined by your specific research objectives and the kind of data you’ve gathered.  Common qualitative data analysis methods include:

Content Analysis

This is a popular approach to qualitative data analysis, and other qualitative analysis techniques may fit within its broad scope; thematic analysis, for example, can be viewed as a form of content analysis. Content analysis is used to identify the patterns that emerge from text, by grouping content into words, concepts, and themes. It is useful for quantifying the relationships between the grouped content. The Columbia School of Public Health has a detailed breakdown of content analysis.

Narrative Analysis

Narrative analysis focuses on the stories people tell and the language they use to make sense of them.  It is particularly useful in qualitative research methods where customer stories are used to get a deep understanding of customers’ perspectives on a specific issue. A narrative analysis might enable us to summarize the outcomes of a focused case study.

Discourse Analysis

Discourse analysis is used to get a thorough understanding of the political, cultural and power dynamics that exist in specific situations.  The focus of discourse analysis here is on the way people express themselves in different social contexts. Discourse analysis is commonly used by brand strategists who hope to understand why a group of people feel the way they do about a brand or product.

Thematic Analysis

Thematic analysis is used to deduce the meaning behind the words people use. This is accomplished by discovering repeating themes in text. These meaningful themes reveal key insights into data and can be quantified, particularly when paired with sentiment analysis . Often, the outcome of thematic analysis is a code frame that captures themes in terms of codes, also called categories. So the process of thematic analysis is also referred to as “coding”. A common use-case for thematic analysis in companies is analysis of customer feedback.

Grounded Theory

Grounded theory is a useful approach when little is known about a subject. It starts by formulating a theory around a single data case, which means the theory is "grounded" in actual data rather than speculation. Additional cases can then be examined to see if they are relevant and can add to the original theory.


Challenges of Qualitative Data Analysis

While Qualitative Data Analysis offers rich insights, it comes with its challenges, and each QDA method has its own hurdles. Let’s take a look at the challenges researchers and analysts might face, depending on the chosen method.

  • Time and Effort (Narrative Analysis): Narrative analysis, which focuses on personal stories, demands patience. Sifting through lengthy narratives to find meaningful insights can be time-consuming and requires dedicated effort.
  • Being Objective (Grounded Theory): Grounded theory, building theories from data, faces the challenges of personal biases. Staying objective while interpreting data is crucial, ensuring conclusions are rooted in the data itself.
  • Complexity (Thematic Analysis): Thematic analysis involves identifying themes within data, a process that can be intricate. Categorizing and understanding themes can be complex, especially when each piece of data varies in context and structure. Thematic Analysis software can simplify this process.
  • Generalizing Findings (Narrative Analysis): Narrative analysis, dealing with individual stories, makes drawing broad conclusions challenging. Extending findings from a single narrative to a broader context requires careful consideration.
  • Managing Data (Thematic Analysis): Thematic analysis involves organizing and managing vast amounts of unstructured data, like interview transcripts. Managing this can be a hefty task, requiring effective data management strategies.
  • Skill Level (Grounded Theory): Grounded theory demands specific skills to build theories from the ground up. Finding or training analysts with these skills poses a challenge, requiring investment in building expertise.

Benefits of qualitative data analysis

Qualitative Data Analysis (QDA) is like a versatile toolkit, offering a tailored approach to understanding your data. The benefits it offers are as diverse as the methods. Let’s explore why choosing the right method matters.

  • Tailored Methods for Specific Needs: QDA isn't one-size-fits-all. Depending on your research objectives and the type of data at hand, different methods offer unique benefits. If you want emotive customer stories, narrative analysis paints a strong picture. When you want to explain a score, thematic analysis reveals insightful patterns.
  • Flexibility with Thematic Analysis: thematic analysis is like a chameleon in the toolkit of QDA. It adapts well to different types of data and research objectives, making it a top choice for any qualitative analysis.
  • Deeper Understanding, Better Products: QDA helps you dive into people's thoughts and feelings. This deep understanding helps you build products and services that truly match what people want, ensuring satisfied customers.
  • Finding the Unexpected: qualitative data often reveals surprises that quantitative data misses, offering new ideas, perspectives, and insights we might otherwise overlook.
  • Building Effective Strategies: Insights from QDA are like strategic guides. They help businesses in crafting plans that match people’s desires.
  • Creating Genuine Connections: Understanding people’s experiences lets businesses connect on a real level. This genuine connection helps build trust and loyalty, priceless for any business.

How to do Qualitative Data Analysis: 5 steps

Now we are going to show how you can do your own qualitative data analysis. We will guide you through this process step by step. As mentioned earlier, you will learn how to do qualitative data analysis manually, and also automatically using modern qualitative data and thematic analysis software.

To get the best value from the analysis and research process, it’s important to be super clear about the nature and scope of the question that’s being researched. This will help you select the data collection channels that are most likely to help you answer your question.

Depending on whether you are a business looking to understand customer sentiment or an academic surveying a school, your approach to qualitative data analysis will be unique.

Once you’re clear, there’s a sequence to follow. And, though there are differences in the manual and automatic approaches, the process steps are mostly the same.

The use case for our step-by-step guide is a company looking to collect customer feedback data and analyze it in order to improve customer experience. By analyzing the feedback, the company derives insights about its business and its customers. You can follow these same steps regardless of the nature of your research. Let’s get started.

Step 1: Gather your qualitative data and conduct research

The first step of qualitative research is data collection: gathering all of your data for analysis. A common challenge is that qualitative data is spread across various sources.

Classic methods of gathering qualitative data

Most companies use traditional methods for gathering qualitative data: conducting interviews with research participants, running surveys, and running focus groups. This data is typically stored in documents, CRMs, databases and knowledge bases. It’s important to examine which data is available and needs to be included in your research project, based on its scope.

Using your existing qualitative feedback

As it becomes easier for customers to engage across a range of different channels, companies are gathering increasingly large amounts of both solicited and unsolicited qualitative feedback.

Most organizations have now invested in Voice of Customer programs , support ticketing systems, chatbot and support conversations, emails and even customer Slack chats.

These new channels provide companies with new ways of getting feedback, and also allow the collection of unstructured feedback data at scale.

The great thing about this data is that it contains a wealth of valuable insights and that it’s already there! When you have a new question about user behavior or your customers, you don’t need to create a new research study or set up a focus group. You can find most answers in the data you already have.

Typically, this data is stored in third-party solutions or a central database, but there are ways to export it or connect to a feedback analysis solution through integrations or an API.

Utilize untapped qualitative data channels

There are many online qualitative data sources you may not have considered. For example, you can find useful qualitative data in social media channels like Twitter or Facebook. Online forums, review sites, and online communities such as Discourse or Reddit also contain valuable data about your customers, or research questions.

If you are considering a qualitative benchmark analysis against competitors, the internet is your best friend, and review analysis is a great place to start. Gathering competitor reviews on sites like Trustpilot, G2, Capterra, Better Business Bureau or the app stores is a great way to perform a competitor benchmark analysis.

Customer feedback analysis software often has integrations into social media and review sites, or you could use a solution like DataMiner to scrape the reviews.

G2.com reviews of the product Airtable. You could pull reviews from G2 for your analysis.

Step 2: Connect & organize all your qualitative data

Now you have all this qualitative data, but there’s a problem: the data is unstructured. Before feedback can be analyzed and assigned any value, it needs to be organized in a single place. Why is this important? Consistency!

If all data is easily accessible in one place and analyzed in a consistent manner, you will have an easier time summarizing and making decisions based on this data.

The manual approach to organizing your data

The classic method of structuring qualitative data is to plot all the raw data you’ve gathered into a spreadsheet.

Typically, research and support teams would share large Excel sheets and different business units would make sense of the qualitative feedback data on their own. Each team collects and organizes the data in a way that best suits them, which means the feedback tends to be kept in separate silos.

An alternative, more robust solution is to store feedback in a central database, like Snowflake or Amazon Redshift.

Keep in mind that when you organize your data in this way, you are often preparing it to be imported into another software. If you go the route of a database, you would need to use an API to push the feedback into a third-party software.

Computer-assisted qualitative data analysis software (CAQDAS)

Traditionally within the manual analysis approach (but not always), qualitative data is imported into CAQDAS software for coding.

In the early 2000s, CAQDAS software was popularised by developers such as ATLAS.ti, NVivo and MAXQDA and eagerly adopted by researchers to assist with the organizing and coding of data.  

The benefits of using computer-assisted qualitative data analysis software:

  • Assists in the organizing of your data
  • Opens you up to exploring different interpretations of your data analysis
  • Allows you to share your dataset more easily and supports group collaboration (allowing for secondary analysis)

However, you still need to code the data, uncover the themes and do the analysis yourself. Therefore, it is still a manual approach.

The user interface of CAQDAS software 'NVivo'

Organizing your qualitative data in a feedback repository

Another solution to organizing your qualitative data is to upload it into a feedback repository, where it can be unified with your other data and easily searched and tagged. There are a number of software solutions that act as a central repository for your qualitative research data. Here are a couple of solutions you could investigate:

  • Dovetail: Dovetail is a research repository with a focus on video and audio transcriptions. You can tag your transcriptions within the platform for theme analysis, and you can also upload your other qualitative data such as research reports, survey responses, support conversations, and customer interviews. Dovetail acts as a single, searchable repository and makes it easier to collaborate with other people on your qualitative research.
  • EnjoyHQ: EnjoyHQ is another research repository with similar functionality to Dovetail. It boasts a more sophisticated search engine, but it has a higher starting subscription cost.

Organizing your qualitative data in a feedback analytics platform

If you have a lot of qualitative customer or employee feedback, from the likes of customer surveys or employee surveys, you will benefit from a feedback analytics platform. A feedback analytics platform is a software that automates the process of both sentiment analysis and thematic analysis . Companies use the integrations offered by these platforms to directly tap into their qualitative data sources (review sites, social media, survey responses, etc.). The data collected is then organized and analyzed consistently within the platform.

If you have data prepared in a spreadsheet, it can also be imported into feedback analytics platforms.

Once all this rich data has been organized within the feedback analytics platform, it is ready to be coded and themed, within the same platform. Thematic is a feedback analytics platform that offers one of the largest libraries of integrations with qualitative data sources.

Some of the qualitative data integrations offered by Thematic

Step 3: Coding your qualitative data

Your feedback data is now organized in one place, whether in a spreadsheet, CAQDAS software, a feedback repository or a feedback analytics platform. The next step is to code your feedback data so that meaningful insights can be extracted from it in the following step.

Coding is the process of labelling and organizing your data in such a way that you can then identify themes in the data, and the relationships between these themes.

To simplify the coding process, you will take small samples of your customer feedback data, come up with a set of codes, or categories capturing themes, and label each piece of feedback, systematically, for patterns and meaning. Then you will take a larger sample of data, revising and refining the codes for greater accuracy and consistency as you go.

If you choose to use a feedback analytics platform, much of this process will be automated and accomplished for you.

The terms to describe different categories of meaning (‘theme’, ‘code’, ‘tag’, ‘category’ etc) can be confusing as they are often used interchangeably.  For clarity, this article will use the term ‘code’.

To code means to identify key words or phrases and assign them to a category of meaning. “I really hate the customer service of this computer software company” would be coded as “poor customer service”.
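
To make this concrete, here is a toy Python sketch of that kind of keyword-to-code assignment; the codes and keywords are hypothetical, and real coding is far more nuanced than string matching:

```python
# Toy keyword-based coding: assign predefined codes to a piece of feedback.
CODE_KEYWORDS = {
    "poor customer service": ["customer service", "support", "unhelpful"],
    "pricing": ["price", "expensive", "cost"],
}

def code_feedback(text):
    """Return every code whose keywords appear in the feedback text."""
    text = text.lower()
    return [code for code, keywords in CODE_KEYWORDS.items()
            if any(kw in text for kw in keywords)]

print(code_feedback("I really hate the customer service of this computer software company"))
# -> ['poor customer service']
```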

How to manually code your qualitative data

  • Decide whether you will use deductive or inductive coding. Deductive coding is when you create a list of predefined codes, and then assign them to the qualitative data. Inductive coding is the opposite of this, you create codes based on the data itself. Codes arise directly from the data and you label them as you go. You need to weigh up the pros and cons of each coding method and select the most appropriate.
  • Read through the feedback data to get a broad sense of what it reveals. Now it’s time to start assigning your first set of codes to statements and sections of text.
  • Keep repeating step 2, adding new codes and revising the code description as often as necessary.  Once it has all been coded, go through everything again, to be sure there are no inconsistencies and that nothing has been overlooked.
  • Create a code frame to group your codes. The coding frame is the organizational structure of all your codes. And there are two commonly used types of coding frames, flat, or hierarchical. A hierarchical code frame will make it easier for you to derive insights from your analysis.
  • Based on the number of times a particular code occurs, you can now see the common themes in your feedback data. This is insightful! If ‘bad customer service’ is a common code, it’s time to take action.

We have a detailed guide dedicated to manually coding your qualitative data .

Example of a hierarchical coding frame in qualitative data analysis
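
As a stand-in for the example pictured above, here is a minimal Python sketch of what a hierarchical code frame and its frequency counts might look like; all codes and responses are hypothetical:

```python
from collections import Counter

# Hypothetical hierarchical code frame: top-level codes mapped to sub-codes.
code_frame = {
    "customer service": ["slow response", "rude staff", "helpful staff"],
    "product": ["bugs", "missing features"],
}

# Coded responses: (top-level code, sub-code) labels assigned per piece of feedback.
coded_responses = [
    [("customer service", "slow response")],
    [("customer service", "rude staff"), ("product", "bugs")],
    [("customer service", "slow response")],
]

# Counting top-level codes surfaces the common themes (step 5 above).
counts = Counter(code for response in coded_responses for code, _ in response)
print(counts.most_common())  # [('customer service', 3), ('product', 1)]
```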

Using software to speed up manual coding of qualitative data

An Excel spreadsheet is still a popular method for coding. But various software solutions can help speed up this process. Here are some examples.

  • CAQDAS / NVivo - CAQDAS software has built-in functionality that allows you to code text within their software. You may find the interface the software offers easier for managing codes than a spreadsheet.
  • Dovetail/EnjoyHQ - You can tag transcripts and other textual data within these solutions. As they are also repositories you may find it simpler to keep the coding in one platform.
  • IBM SPSS - SPSS is statistical analysis software that may make coding easier than working in a spreadsheet.
  • Ascribe - Ascribe’s ‘Coder’ is a coding management system. Its user interface will make it easier for you to manage your codes.

Automating the qualitative coding process using thematic analysis software

In solutions which speed up the manual coding process, you still have to come up with valid codes and often apply codes manually to pieces of feedback. But there are also solutions that automate both the discovery and the application of codes.

Advances in machine learning have now made it possible to read, code and structure qualitative data automatically. This type of automated coding is offered by thematic analysis software .

Automation makes it far simpler and faster to code the feedback and group it into themes. By incorporating natural language processing (NLP) into the software, the AI looks across sentences and phrases to identify common themes and meaningful statements. Some automated solutions detect repeating patterns and assign codes to them; others make you train the AI by providing examples. You could say that the AI learns the meaning of the feedback on its own.
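
As a deliberately simplified illustration of the underlying idea (not how any particular product works), the sketch below clusters feedback by TF-IDF similarity with scikit-learn to suggest rough candidate themes:

```python
# Simplified automated theme discovery: cluster feedback by textual similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

feedback = [
    "Support was slow and unhelpful",
    "Customer service never replied to my ticket",
    "Love the new dashboard design",
    "The dashboard looks great and is easy to use",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(feedback)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for label, text in sorted(zip(labels, feedback)):
    print(label, text)  # feedback grouped into two candidate themes
```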

Thematic automates the coding of qualitative feedback regardless of source. There’s no need to set up themes or categories in advance. Simply upload your data and wait a few minutes. You can also manually edit the codes to further refine their accuracy.  Experiments conducted indicate that Thematic’s automated coding is just as accurate as manual coding .

Paired with sentiment analysis and advanced text analytics, these automated solutions become powerful tools for deriving quality business or research insights.

You could also build your own, if you have the resources!

The key benefits of using an automated coding solution

Automated analysis can often be set up fast and there’s the potential to uncover things that would never have been revealed if you had given the software a prescribed list of themes to look for.

Because the model applies a consistent rule to the data, it captures phrases or statements that a human eye might have missed.

Complete and consistent analysis of customer feedback enables more meaningful findings. Leading us into step 4.

Step 4: Analyze your data: Find meaningful insights

Now we are going to analyze our data to find insights. This is where we start to answer our research questions. Keep in mind that step 4 and step 5 (tell the story) have some overlap, because creating visualizations is part of both the analysis process and the reporting.

The task of uncovering insights is to scour through the codes that emerge from the data and draw meaningful correlations from them. It is also about making sure each insight is distinct and has enough data to support it.

Part of the analysis is to establish how much each code relates to different demographics and customer profiles, and identify whether there’s any relationship between these data points.

Manually create sub-codes to improve the quality of insights

If your code frame only has one level, you may find that your codes are too broad to be able to extract meaningful insights. This is where it is valuable to create sub-codes to your primary codes. This process is sometimes referred to as meta coding.

Note: If you take an inductive coding approach, you can create sub-codes as you are reading through your feedback data and coding it.

While time-consuming, this exercise will improve the quality of your analysis. Here is an example of what sub-codes could look like.

Example of sub-codes

You need to carefully read your qualitative data to create quality sub-codes. But as you can see, the depth of analysis is greatly improved. By calculating the frequency of these sub-codes you can get insight into which  customer service problems you can immediately address.

Correlate the frequency of codes to customer segments

Many businesses use customer segmentation, and you may have your own respondent segments that you can apply to your qualitative analysis. Segmentation is the practice of dividing customers or research respondents into subgroups.

Segments can be based on:

  • Demographics
  • Any other data type that you care to segment by

It is particularly useful to see the occurrence of codes within your segments. If one of your customer segments is considered unimportant to your business, but they are the cause of nearly all customer service complaints, it may be in your best interest to focus attention elsewhere. This is a useful insight!
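
A quick way to see code occurrence by segment is a cross-tabulation. Here is a small hypothetical sketch in Python with pandas:

```python
import pandas as pd

# One row per coded response; segment labels and codes are hypothetical.
df = pd.DataFrame({
    "segment": ["enterprise", "enterprise", "smb", "smb", "smb"],
    "code":    ["slow support", "pricing", "slow support", "slow support", "pricing"],
})

# Rows are segments, columns are codes, cells count coded responses.
print(pd.crosstab(df["segment"], df["code"]))
```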

Manually visualizing coded qualitative data

There are formulas you can use to visualize key insights in your data. The formulas we will suggest are imperative if you are measuring a score alongside your feedback.

If you are collecting a metric alongside your qualitative data, this is a key visualization. Impact answers the question: “What’s the impact of a code on my overall score?”. Using Net Promoter Score (NPS) as an example, first you need to:

  • (A) Calculate overall NPS
  • (B) Calculate NPS in the subset of responses that do not contain that theme
  • Subtract B from A

Then you can use this simple formula to calculate code impact on NPS.

Visualizing qualitative data: Calculating the impact of a code on your score
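
Here is a minimal Python sketch of that impact calculation on hypothetical survey responses, where each response carries an NPS rating and the set of codes assigned to it:

```python
# Impact of a code on NPS: overall NPS (A) minus NPS of responses without the code (B).
def nps(ratings):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical responses: (rating, codes assigned to the verbatim)
responses = [
    (10, {"helpful support"}),
    (9,  {"helpful support"}),
    (3,  {"slow delivery"}),
    (6,  {"slow delivery", "pricing"}),
    (8,  set()),
]

def code_impact(code):
    overall = nps([r for r, _ in responses])                            # (A)
    without = nps([r for r, codes in responses if code not in codes])   # (B)
    return overall - without                                            # A - B

print(round(code_impact("slow delivery"), 1))  # -66.7: this code drags NPS down
```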

You can then visualize this data using a bar chart.

You can download our CX toolkit - it includes a template to recreate this.

Trends over time

This analysis can help you answer questions like: “Which codes are linked to decreases or increases in my score over time?”

We need to compare two sequences of numbers: NPS over time and code frequency over time. Using Excel, calculate the correlation between the two sequences, which can be either positive (the more mentions of the code, the higher the NPS; see picture below) or negative (the more mentions of the code, the lower the NPS).

Now you need to plot code frequency against the absolute value of code correlation with NPS. Here is the formula:

Analyzing qualitative data: Calculate which codes are linked to increases or decreases in my score
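
A rough Python sketch of this trends-over-time calculation, using hypothetical monthly numbers, might look like this:

```python
import numpy as np

# Hypothetical monthly series: overall NPS and mentions of one code.
nps_by_month = np.array([30, 28, 25, 20, 18, 15])
code_freq_by_month = np.array([5, 8, 12, 15, 18, 22])

# Pearson correlation between the two sequences
# (negative here: more mentions, lower NPS)
corr = np.corrcoef(nps_by_month, code_freq_by_month)[0, 1]

# The plot-ready pair for this code: total frequency vs. |correlation| with NPS
print(code_freq_by_month.sum(), round(abs(corr), 2))
```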

The visualization could look like this:

Visualizing qualitative data trends over time

These are two examples, but there are more. For a third manual formula, and to learn why word clouds are not an insightful form of analysis, read our visualizations article.

Using a text analytics solution to automate analysis

Automated text analytics solutions enable codes and sub-codes to be pulled out of the data automatically. This makes it far faster and easier to identify what’s driving negative or positive results. And to pick up emerging trends and find all manner of rich insights in the data.

Another benefit of AI-driven text analytics software is its built-in capability for sentiment analysis, which provides the emotive context behind your feedback and other qualitative textual data.

Thematic provides text analytics that goes further by allowing users to apply their expertise on business context to edit or augment the AI-generated outputs.

Since the move away from manual research is generally about reducing the human element, adding human input to the technology might sound counter-intuitive. However, this is mostly to make sure important business nuances in the feedback aren’t missed during coding. The result is a higher accuracy of analysis. This is sometimes referred to as augmented intelligence .

Codes displayed by volume within Thematic. You can 'manage themes' to introduce human input.

Step 5: Report on your data: Tell the story

The last step of analyzing your qualitative data is to report on it, to tell the story. At this point, the codes are fully developed and the focus is on communicating the narrative to the audience.

A coherent outline of the qualitative research, the findings and the insights is vital for stakeholders to discuss and debate before they can devise a meaningful course of action.

Creating graphs and reporting in PowerPoint

Typically, qualitative researchers take the tried and tested approach of distilling their report into a series of charts, tables and other visuals, which are woven into a narrative for presentation in PowerPoint.

Using visualization software for reporting

With data transformation and APIs, the analyzed data can be shared with data visualisation software such as Power BI, Tableau, Google Studio or Looker. Power BI and Tableau are among the most preferred options.

Visualizing your insights inside a feedback analytics platform

Feedback analytics platforms, like Thematic, incorporate visualisation tools that intuitively turn key data and insights into graphs. This removes the time-consuming work of constructing charts to visually identify patterns, and creates more time to focus on building a compelling narrative that highlights the insights, in bite-size chunks, for executive teams to review.

Using a feedback analytics platform with visualization tools means you don’t have to use a separate product for visualizations. You can export graphs into PowerPoint straight from the platform.

Two examples of qualitative data visualizations within Thematic

Conclusion - Manual or Automated?

There are those who remain deeply invested in the manual approach - because it’s familiar, because they’re reluctant to spend money and time learning new software, or because they’ve been burned by the overpromises of AI.  

For projects that involve small datasets, manual analysis makes sense. For example, if the objective is simply to quantify an answer to a simple question like “Do customers prefer X concepts to Y?”, or if the findings are being extracted from a small set of focus groups and interviews, sometimes it’s easier to just read them.

However, as new generations come into the workplace, it’s technology-driven solutions that feel more comfortable and practical. And the merits are undeniable.  Especially if the objective is to go deeper and understand the ‘why’ behind customers’ preference for X or Y. And even more especially if time and money are considerations.

The ability to collect a free flow of qualitative feedback data at the same time as the metric means AI can cost-effectively scan, crunch, score and analyze a ton of feedback from one system in one go. And time-intensive processes like focus groups, or coding, that used to take weeks, can now be completed in a matter of hours or days.

But aside from the ever-present business case to speed things up and keep costs down, there are also powerful research imperatives for automated analysis of qualitative data: namely, accuracy and consistency.

Finding insights hidden in feedback requires consistency, especially in coding.  Not to mention catching all the ‘unknown unknowns’ that can skew research findings and steering clear of cognitive bias.

Some say that without manual data analysis researchers won’t get an accurate “feel” for the insights. However, the larger the data sets are, the harder it is to sort through and organize feedback that has been pulled from different places. And the more difficult it is to stay on course, the greater the risk of drawing incorrect or incomplete conclusions.

Though the process steps for qualitative data analysis have remained pretty much unchanged since psychologist Paul Felix Lazarsfeld paved the path a hundred years ago, the impact digital technology has had on types of qualitative feedback data and the approach to the analysis are profound.  

If you want to try an automated feedback analysis solution on your own qualitative data, you can get started with Thematic .


Research Guide: Data analysis and reporting findings


Data analysis and findings

Data analysis is one of the most crucial parts of any research. It summarizes the collected data and involves interpreting the data gathered, using analytical and logical reasoning to determine patterns, relationships or trends.

Data Analysis Checklist

Cleaning data

  • Did you capture and code your data in the right manner?
  • Do you have all the data, or is some missing?
  • Do you have enough observations?
  • Do you have any outliers? If yes, what is the remedy?
  • Does your data have the potential to answer your questions?

Analyzing data

  • Visualize your data, e.g. with charts, tables and graphs.
  • Identify patterns, correlations and trends.
  • Test your hypotheses.
  • Let your data tell a story.

Reporting the results

  • Communicate and interpret the results.
  • Conclude and recommend.
  • Make sure your targeted audience can understand your results.

General tips

  • Use more datasets and samples.
  • Use accessible and understandable data analysis tools.
  • Do not delegate your data analysis.
  • Clean your data to confirm that it is complete and free from errors.
  • Analyze the cleaned data.
  • Understand your results.
  • Keep in mind who will be reading your results, and present them in a way they will understand.
  • Share your results with your supervisor often.

Past presentations

  • PhD Writing Retreat: ‘Analysing Fieldwork Data’. A clear and concise presentation on the ‘now what’ and ‘so what’ of data collection and analysis, compiled and originally presented by Cori Wielenga.

Online Resources


  • Qualitative analysis of interview data: A step-by-step guide
  • Qualitative Data Analysis - Coding & Developing Themes




THE CDC FIELD EPIDEMIOLOGY MANUAL

Analyzing and Interpreting Data

Richard C. Dicker

  • Planning the Analysis
  • Analyzing Data from a Field Investigation
  • Summary Exposure Tables
  • Stratified Analysis
  • Confounding
  • Effect Modification
  • Dose-Response
  • Interpreting Data from a Field Investigation

Field investigations are usually conducted to identify the factors that increased a person’s risk for a disease or other health outcome. In certain field investigations, identifying the cause is sufficient; if the cause can be eliminated, the problem is solved. In other investigations, the goal is to quantify the association between exposure (or any population characteristic) and the health outcome to guide interventions or advance knowledge. Both types of field investigations require suitable, but not necessarily sophisticated, analytic methods. This chapter describes the strategy for planning an analysis, methods for conducting the analysis, and guidelines for interpreting the results.

A thoughtfully planned and carefully executed analysis is as crucial for a field investigation as it is for a protocol-based study. Planning is necessary to ensure that the appropriate hypotheses will be considered and that the relevant data will be collected, recorded, managed, analyzed, and interpreted to address those hypotheses. Therefore, the time to decide what data to collect and how to analyze those data is before you design your questionnaire, not after you have collected the data.

An analysis plan is a document that guides how you progress from raw data to the final report. It describes where you are starting (data sources and data sets), how you will look at and analyze the data, and where you need to finish (final report). It lays out the key components of the analysis in a logical sequence and provides a guide to follow during the actual analysis.

An analysis plan includes some or most of the content listed in Box 8.1. Some of the listed elements are more likely to appear in an analysis plan for a protocol-based planned study, but even an outbreak investigation should include the key components in a more abbreviated analysis plan, or at least in a series of table shells.

  • List of the research questions or hypotheses
  • Source(s) of data
  • Description of population or groups (inclusion or exclusion criteria)
  • Source of data or data sets, particularly for secondary data analysis or population denominators
  • Type of study
  • How data will be manipulated
  • Data sets to be used or merged
  • New variables to be created
  • Key variables (attach data dictionary of all variables)
  • Demographic and exposure variables
  • Outcome or endpoint variables
  • Stratification variables (e.g., potential confounders or effect modifiers)
  • How variables will be analyzed (e.g., as a continuous variable or grouped in categories)
  • How to deal with missing values
  • Order of analysis (e.g., frequency distributions, two-way tables, stratified analysis, dose-response, or group analysis)
  • Measures of occurrence, association, tests of significance, or confidence intervals to be used
  • Table shells to be used in analysis
  • Table shells to be included in final report
  • Research question or hypotheses . The analysis plan usually begins with the research questions or hypotheses you plan to address. Well-reasoned research questions or hypotheses lead directly to the variables that need to be analyzed and the methods of analysis. For example, the question, “What caused the outbreak of gastroenteritis?” might be a suitable objective for a field investigation, but it is not a specific research question. A more specific question—for example, “Which foods were more likely to have been consumed by case-patients than by controls?”—indicates that key variables will be food items and case–control status and that the analysis method will be a two-by-two table for each food.
  • Analytic strategies . Different types of studies (e.g., cohort, case–control, or cross-sectional) are analyzed with different measures and methods. Therefore, the analysis strategy must be consistent with how the data will be collected. For example, data from a simple retrospective cohort study should be analyzed by calculating and comparing attack rates among exposure groups. Data from a case–control study must be analyzed by comparing exposures among case-patients and controls, and the data must account for matching in the analysis if matching was used in the design. Data from a cross-sectional study or survey might need to incorporate weights or design effects in the analysis. The analysis plan should specify which variables are most important—exposures and outcomes of interest, other known risk factors, study design factors (e.g., matching variables), potential confounders, and potential effect modifiers.
  • Data dictionary . A data dictionary is a document that provides key information about each variable. Typically, a data dictionary lists each variable’s name, a brief description, what type of variable it is (e.g., numeric, text, or date), allowable values, and an optional comment. Data dictionaries can be organized in different ways, but a tabular format with one row per variable, and columns for name, description, type, legal value, and comment, is easy to organize (see the example in Table 8.1, from an outbreak investigation of oropharyngeal tularemia [1]). A supplement to the data dictionary might include a copy of the questionnaire with the variable names written next to each question. A small illustrative sketch follows this list.
  • Get to know your data . Plan to get to know your data by reviewing (1) the frequency of responses and descriptive statistics for each variable; (2) the minimum, maximum, and average values for each variable; (3) whether any variables have the same response for every record; and (4) whether any variables have many or all missing values. These patterns will influence how you analyze these variables or drop them from the analysis altogether.
  • Table shells . The next step in developing the analysis plan is designing the table shells. A table shell, sometimes called a dummy table , is a table (e.g., frequency distribution or two-by-two table) that is titled and fully labeled but contains no data. The numbers will be filled in as the analysis progresses. Table shells provide a guide to the analysis, so their sequence should proceed in logical order from simple (e.g., descriptive epidemiology) to more complex (e.g., analytic epidemiology) ( Box 8.2 ). Each table shell should indicate which measures (e.g., attack rates, risk ratios [RR] or odds ratios [ORs], 95% confidence intervals [CIs]) and statistics (e.g., chi-square and p value) should accompany the table. See Handout 8.1 for an example of a table shell created for the field investigation of oropharyngeal tularemia ( 1 ).
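
To make the data dictionary format concrete, here is a small hypothetical sketch in Python; the variable names, types, and allowable values are invented for illustration:

```python
# Hypothetical data dictionary entries: one row per variable.
data_dictionary = [
    {"name": "age",   "description": "Age in years",          "type": "numeric",
     "values": "0-120", "comment": ""},
    {"name": "sex",   "description": "Sex of participant",    "type": "text",
     "values": "M, F",  "comment": ""},
    {"name": "onset", "description": "Date of symptom onset", "type": "date",
     "values": "",      "comment": "blank if asymptomatic"},
]

for entry in data_dictionary:
    print(entry["name"], "|", entry["description"], "|", entry["type"])
```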

The first two tables usually generated as part of the analysis of data from a field investigation are those that describe clinical features of the case-patients and present the descriptive epidemiology. Because descriptive epidemiology is addressed in Chapter 6 , the remainder of this chapter addresses the analytic epidemiology tools used most commonly in field investigations.

Handout 8.2 depicts output from the Classic Analysis module of Epi Info 7 (Centers for Disease Control and Prevention, Atlanta, GA) ( 2 ). It demonstrates the output from the TABLES command for data from a typical field investigation. Note the key elements of the output: (1) a cross-tabulated table summarizing the results, (2) point estimates of measures of association, (3) 95% CIs for each point estimate, and (4) statistical test results. Each of these elements is discussed in the following sections.

Source: Adapted from Reference 1.

Box 8.2. Typical sequence of tables for a field investigation:

Table 1. Clinical features (e.g., signs and symptoms, percentage of laboratory-confirmed cases, percentage of hospitalized patients, and percentage of patients who died).

Table 2. Demographic (e.g., age and sex) and other key characteristics of study participants, by case–control status if a case–control study.

Time, by date of illness onset (could be included in Table 1, but for outbreaks, better displayed as an epidemic curve).

Place (geographic area of residence or occurrence, in Table 2 or in a spot or shaded map).

Table 3. Primary tables of exposure–outcome association.

Table 4. Stratification (Table 3 with separate effects and assessment of confounding and effect modification).

Table 5. Refinements (Table 3 with, for example, dose-response, latency, and use of a more sensitive or more specific case definition).

Table 6. Specific group analyses.

Two-by-Two Tables

A two-by-two table is so named because it is a cross-tabulation of two variables—exposure and health outcome—that each have two categories, usually “yes” and “no” ( Handout 8.3 ). The two-by-two table is the best way to summarize data that reflect the association between a particular exposure (e.g., consumption of a specific food) and the health outcome of interest (e.g., gastroenteritis). The association is usually quantified by calculating a measure of association (e.g., a risk ratio [RR] or OR) from the data in the two-by-two table (see the following section).

  • In a typical two-by-two table used in field epidemiology, disease status (e.g., ill or well, case or control) is represented along the top of the table, and exposure status (e.g., exposed or unexposed) along the side.
  • Depending on the exposure being studied, the rows can be labeled as shown in Handout 8.3, or, for example, as exposed and unexposed or ever and never . By convention, the exposed group is placed on the top row.
  • Depending on the disease or health outcome being studied, the columns can be labeled as shown in Handout 8.3, or for example, as ill and well, case and control , or dead and alive . By convention, the ill or case group is placed in the left column.
  • The intersection of a row and a column in which a count is recorded is known as a cell . The letters a, b, c, and d within the four cells refer to the number of persons with the disease status indicated in the column heading at the top and the exposure status indicated in the row label to the left. For example, cell c contains the number of ill but unexposed persons. The row totals are labeled H1 and H0 (or H2; H for horizontal ), and the column totals are labeled V1 and V0 (or V2; V for vertical ). The total number of persons included in the two-by-two table is written in the lower right corner and is represented by the letter T or N.
  • If the data are from a cohort study, attack rates (i.e., the proportion of persons who become ill during the time period of interest) are sometimes provided to the right of the row totals. RRs or ORs, CIs, or p values are often provided to the right of or beneath the table.

The illustrative cross-tabulation of tap water consumption (exposure) and illness status (outcome) from the investigation of oropharyngeal tularemia is displayed in Table 8.2 ( 1 ).

Table Shell: Association Between Drinking Water From Different Sources and Oropharyngeal Tularemia (Sancaktepe Village, Bayburt Province, Turkey, July–August 2013)

Abbreviation: CI, confidence interval. Adapted from Reference 1 .

Typical Output From Classic Analysis Module, Epi Info Version 7, Using The Tables Command

Source: Reference 2 .

Table 8.2: Association Between Drinking Water From Different Sources and Oropharyngeal Tularemia (Sancaktepe Village, Bayburt Province, Turkey, July–August 2013)

Abbreviation: CI, confidence interval.

Risk ratio = 26.59 / 10.59 = 2.5; 95% confidence interval = (1.3–4.9); chi-square (uncorrected) = 8.7 (p = 0.003). Source: Adapted from Reference 1.

Measures of Association

A measure of association quantifies the strength or magnitude of the statistical association between an exposure and outcome. Measures of association are sometimes called measures of effect because if the exposure is causally related to the health outcome, the measure quantifies the effect of exposure on the probability that the health outcome will occur.

The measures of association most commonly used in field epidemiology are all ratios—RRs, ORs, prevalence ratios (PRs), and prevalence ORs (PORs). These ratios can be thought of as comparing the observed with the expected—that is, the observed amount of disease among persons exposed versus the expected (or baseline) amount of disease among persons unexposed. The measures clearly demonstrate whether the amount of disease among the exposed group is similar to, higher than, or lower than (and by how much) the amount of disease in the baseline group.

  • The value of each measure of association equals 1.0 when the amount of disease is the same among the exposed and unexposed groups.
  • The measure has a value greater than 1.0 when the amount of disease is greater among the exposed group than among the unexposed group, consistent with a harmful effect.
  • The measure has a value less than 1.0 when the amount of disease among the exposed group is less than it is among the unexposed group, as when the exposure protects against occurrence of disease (e.g., vaccination).

Different measures of association are used with different types of studies. In the retrospective cohort study typical of outbreak investigations, the most commonly used measure is the RR, which is simply the ratio of attack rates. For most case–control studies, because attack rates cannot be calculated, the measure of choice is the OR.

Cross-sectional studies or surveys typically measure prevalence (existing cases) rather than incidence (new cases) of a health condition. Prevalence measures of association analogous to the RR and OR—the PR and POR , respectively—are commonly used.

Risk Ratio (Relative Risk)

The RR, the preferred measure for cohort studies, is calculated as the attack rate (risk) among the exposed group divided by the attack rate (risk) among the unexposed group. Using the notations in Handout 8.3,

RR = risk among exposed / risk among unexposed = (a/H1) / (c/H0)

From Table 8.2, the attack rate (i.e., risk) for acquiring oropharyngeal tularemia among persons who had drunk tap water at the banquet was 26.6%. The attack rate (i.e., risk) for those who had not drunk tap water was 10.6%. Thus, the RR is calculated as 0.266/0.106 = 2.5. That is, persons who had drunk tap water were 2.5 times as likely to become ill as those who had not drunk tap water (1).
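To make the arithmetic concrete, here is a minimal Python sketch of the RR calculation. The cell counts below are those implied by the attack rates quoted above (46 ill of 173 tap water drinkers; 9 ill of 85 non-drinkers) and are used for illustration only:

    # Risk ratio from a cohort-study two-by-two table (Handout 8.3 notation).
    a, H1 = 46, 173   # ill / total among the exposed (tap water drinkers)
    c, H0 = 9, 85     # ill / total among the unexposed

    risk_exposed = a / H1      # 0.2659 -> 26.6% attack rate
    risk_unexposed = c / H0    # 0.1059 -> 10.6% attack rate
    rr = risk_exposed / risk_unexposed
    print(f"RR = {rr:.1f}")    # RR = 2.5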

Odds Ratio

The OR is the preferred measure of association for case–control data. Conceptually, it is calculated as the odds of exposure among case-patients divided by the odds of exposure among controls. However, in practice, it is calculated as the cross-product ratio. Using the notations in Handout 8.3,

OR = ad/bc

The OR is a useful measure of association because it provides an estimate of the association between exposure and disease from case–control data when an RR cannot be calculated. Additionally, when the outcome is relatively uncommon among the population (e.g., <5%), the OR from a case–control study approximates the RR that would have been derived from a cohort study, had one been performed. However, when the outcome is more common, the OR overestimates the RR.

The illustrative data in Handout 8.4 are from a case–control study of acute renal failure in Panama in 2006 (3). Because the data are from a case–control study, neither attack rates (risks) nor an RR can be calculated. The OR—calculated as (37 × 110)/(29 × 4) = 35.1 (95% confidence interval = 11.6–106.4; chi-square [uncorrected] = 65.6, p<0.001)—is exceptionally high, indicating a strong association between ingesting liquid cough syrup and acute renal failure.
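A one-line cross-product check in Python, using the Handout 8.4 counts quoted above (37 exposed and 4 unexposed case-patients; 29 exposed and 110 unexposed controls):

    # Odds ratio as the cross-product ratio (case-control data, Handout 8.4).
    a, b = 37, 29    # exposed case-patients, exposed controls
    c, d = 4, 110    # unexposed case-patients, unexposed controls
    odds_ratio = (a * d) / (b * c)
    print(f"OR = {odds_ratio:.1f}")   # OR = 35.1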

Confounding

Confounding is the distortion of an exposure–outcome association by the effect of a third factor (a confounder). A third factor might be a confounder if it is

  • Associated with the outcome independent of the exposure—that is, it must be an independent risk factor; and,
  • Associated with the exposure but is not a consequence of it.

Consider a hypothetical retrospective cohort study of mortality among manufacturing employees that determined that workers involved with the manufacturing process were substantially more likely to die during the follow-up period than office workers and salespersons in the same industry.

  • The increase in mortality might reflexively be attributed to one or more exposures during the manufacturing process.
  • If, however, the manufacturing workers’ average age was 15 years older than the other workers, mortality reasonably could be expected to be higher among the older workers.
  • In that situation, age likely is a confounder that could account for at least some of the increased mortality. (Note that age satisfies the two criteria described previously: increasing age is associated with increased mortality, regardless of occupation; and, in that industry, age was associated with job—specifically, manufacturing employees were older than the office workers).

Unfortunately, confounding is common. The first step in dealing with confounding is to look for it. If confounding is identified, the second step is to control for or adjust for its distorting effect by using available statistical methods.

Looking for Confounding

The most common method for looking for confounding is to stratify the exposure–outcome association of interest by the third variable suspected to be a confounder.

  • Because one of the two criteria for a confounding variable is that it should be associated with the outcome, the list of potential confounders should include the known risk factors for the disease. The list also should include matching variables. Because age frequently is a confounder, it should be considered a potential confounder in any data set.
  • For each stratum, compute a stratum-specific measure of association. If the stratification variable is sex, only women will be in one stratum and only men in the other. The exposure–outcome association is calculated separately for women and for men. Sex can no longer be a confounder in these strata because women are compared with women and men are compared with men.


Prevalence Ratio and Prevalence Odds Ratio

Cross-sectional studies or surveys usually measure the prevalence rather than incidence of a health status (e.g., vaccination status) or condition (e.g., hypertension) among a population. The prevalence measures of association analogous to the RR and OR are, respectively, the PR and POR .

The PR is calculated as the prevalence among the index group divided by the prevalence among the comparison group. Using the notations in Handout 8.3 ,

PR = prevalence among index group / prevalence among comparison group = (a/H1) / (c/H0)

The POR is calculated like an OR.

POR = ad/bc

In a study of HIV seroprevalence among current users of crack cocaine versus never users, 165 of 780 current users were HIV-positive (prevalence = 21.2%), compared with 40 of 464 never users (prevalence = 8.6%) (4). The PR and POR were close (2.5 and 2.8, respectively), but the PR is easier to explain.
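A short Python sketch reproducing these figures from the counts given in the text:

    # Prevalence ratio and prevalence odds ratio (cross-sectional data).
    # Counts are from the crack cocaine / HIV study cited in the text.
    a, H1 = 165, 780   # HIV-positive / total among current users
    c, H0 = 40, 464    # HIV-positive / total among never users
    b, d = H1 - a, H0 - c

    pr = (a / H1) / (c / H0)          # 21.2% / 8.6%
    por = (a * d) / (b * c)
    print(f"PR = {pr:.1f}, POR = {por:.1f}")   # PR = 2.5, POR = 2.8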


Measures of Public Health Impact

A measure of public health impact places the exposure–disease association in a public health perspective. The impact measure reflects the apparent contribution of the exposure to the health outcome among a population. For example, for an exposure associated with an increased risk for disease (e.g., smoking and lung cancer), the attributable risk percent represents the amount of lung cancer among smokers ascribed to smoking, which also can be regarded as the expected reduction in disease load if the exposure could be removed or had never existed.

For an exposure associated with a decreased risk for disease (e.g., vaccination), the prevented fraction represents the observed reduction in disease load attributable to the current level of exposure among the population. Note that the terms attributable and prevented convey more than mere statistical association. They imply a direct cause-and-effect relationship between exposure and disease. Therefore, these measures should be presented only after thoughtful inference of causality.

Attributable Risk Percent

The attributable risk percent (attributable fraction or proportion among the exposed, etiologic fraction) is the proportion of cases among the exposed group presumably attributable to the exposure. This measure assumes that the level of risk among the unexposed group (who are considered to have the baseline or background risk for disease) also applies to the exposed group, so that only the excess risk should be attributed to the exposure. The attributable risk percent can be calculated with either of the following algebraically equivalent formulas:

Attributable risk percent = (risk among exposed − risk among unexposed) / risk among exposed = (RR − 1) / RR

In a case–control study, if the OR is a reasonable approximation of the RR, an attributable risk percent can be calculated from the OR.

Attributable risk percent = (OR–1) / OR

In the outbreak setting, the attributable risk percent can be used to quantify how much of the disease burden can be ascribed to a particular exposure.
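For example, applying the formula to the tap water RR of 2.5 from Table 8.2 (a minimal sketch; the causal interpretation is assumed here purely for illustration):

    # Attributable risk percent for the tap water example (RR = 2.5).
    rr = 2.5
    arp = (rr - 1) / rr
    print(f"Attributable risk percent = {arp:.0%}")   # 60%
    # Under a causal interpretation, about 60% of cases among tap water
    # drinkers would be ascribed to the tap water exposure.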

Prevented Fraction Among the Exposed Group (Vaccine Efficacy)

The prevented fraction among the exposed group can be calculated when the RR or OR is less than 1.0. This measure is the proportion of potential cases prevented by a beneficial exposure (e.g., bed nets that prevent nighttime mosquito bites and, consequently, malaria). It can also be regarded as the proportion of new cases that would have occurred in the absence of the beneficial exposure. Algebraically, the prevented fraction among the exposed population is identical to vaccine efficacy.

Prevented fraction among the exposed group = vaccine efficacy = (risk among unexposed − risk among exposed) / risk among unexposed = 1 − RR

Handout 8.5 displays data from a varicella (chickenpox) outbreak at an elementary school in Nebraska in 2004 (5). The risk for varicella was 13.0% among vaccinated children and 66.7% among unvaccinated children. The vaccine efficacy based on these data was calculated as (0.667 − 0.130)/0.667 = 0.805, or 80.5%. This vaccine efficacy of 80.5% indicates that vaccination prevented approximately 80% of the cases that would otherwise have occurred among vaccinated children had they not been vaccinated.

Risk ratio = 13.0/66.7 = 0.195; vaccine efficacy = (66.7 − 13.0)/66.7 = 80.5%. Source: Adapted from Reference 5.

Tests of Statistical Significance

Tests of statistical significance are used to determine how likely the observed results would have occurred by chance alone if exposure was unrelated to the health outcome. This section describes the key factors to consider when applying statistical tests to data from two-by-two tables.

  • Statistical testing begins with the assumption that, among the source population, exposure is unrelated to disease. This assumption is known as the null hypothesis . The alternative hypothesis , which will be adopted if the null hypothesis proves to be implausible, is that exposure is associated with disease.
  • Next, compute a measure of association (e.g., an RR or OR).
  • A small p value means that you would be unlikely to observe such an association if the null hypothesis were true. In other words, a small p value indicates that the null hypothesis is implausible, given available data.
  • If this p value is smaller than a predetermined cutoff, called alpha (usually 0.05 or 5%), you discard (reject) the null hypothesis in favor of the alternative hypothesis. The association is then said to be statistically significant .
  • If the p value is larger than the cutoff (e.g., a p value of 0.06 when alpha is 0.05), do not reject the null hypothesis; the apparent association could be a chance finding.
  • In a type I error (also called alpha error ), the null hypothesis is rejected when in fact it is true.
  • In a type II error (also called beta error ), the null hypothesis is not rejected when in fact it is false.

Testing and Interpreting Data in a Two-by-Two Table

For data in a two-by-two table, Epi Info reports the results from two different tests—the chi-square test and the Fisher exact test—each with variations (Handout 8.2). These tests are not specific to any particular measure of association. The same test can be used regardless of whether you are interested in the RR, OR, or attributable risk percent.

  • If the expected value in any cell is less than 5, use the Fisher exact test; it is the commonly accepted standard in that situation. (Remember: The expected value for any cell can be determined by multiplying the row total by the column total and dividing by the table total; a test-selection sketch follows this list.)
  • If all expected values in the two-by-two table are 5 or greater, choose one of the chi-square tests. Fortunately, for most analyses, the three chi-square formulas provide p values sufficiently similar to make the same decision regarding the null hypothesis based on all three. However, when the different formulas point to different decisions (usually when all three p values are approximately 0.05), epidemiologic judgment is required. Some field epidemiologists prefer the Yates-corrected formula because it is the least likely to lead to a type I error (but the most likely to lead to a type II error). Others acknowledge that the Yates correction often overcompensates; therefore, they prefer the uncorrected formula. Epidemiologists who frequently perform stratified analyses are accustomed to using the Mantel-Haenszel formula; therefore, they tend to use this formula even for simple two-by-two tables.
  • Measure of association. The measures of association (e.g., RRs and ORs) reflect the strength of the association between an exposure and a disease. These measures are usually independent of the size of the study and can be regarded as the best guess of the true degree of association among the source population. However, the measure gives no indication of its reliability (i.e., how much faith to put in it).
  • Test of significance. In contrast, a test of significance provides an indication of how likely it is that the observed association is the result of chance. Although the chi-square test statistic is influenced both by the magnitude of the association and the study size, it does not distinguish the contribution of each one. Thus, the measure of association and the test of significance (or a CI; see Confidence Intervals for Measures of Association) provide complementary information.
  • Role of statistical significance. Statistical significance does not by itself indicate a cause-and-effect association. An observed association might indeed represent a causal connection, but it might also result from chance, selection bias, information bias, confounding, or other sources of error in the study’s design, execution, or analysis. Statistical testing relates only to the role of chance in explaining an observed association, and statistical significance indicates only that chance is an unlikely, although not impossible, explanation of the association. Epidemiologic judgment is required when considering these and other criteria for inferring causation (e.g., consistency of the findings with those from other studies, the temporal association between exposure and disease, or biologic plausibility).
  • Public health implications of statistical significance. Finally, statistical significance does not necessarily mean public health significance. With a large study, a weak association with little public health or clinical relevance might nonetheless be statistically significant. More commonly, if a study is small, an association of public health or clinical importance might fail to reach statistical significance.
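The following Python sketch shows one way to automate the test-selection rule described in the list above, using SciPy's chi2_contingency and fisher_exact functions. The table counts are those implied by the tularemia attack rates and are illustrative only:

    import numpy as np
    from scipy.stats import chi2_contingency, fisher_exact

    # Two-by-two table: rows = exposed/unexposed, columns = ill/well.
    table = np.array([[46, 127],
                      [9,  76]])

    # chi2_contingency also returns the expected cell counts
    # (row total x column total / table total).
    chi2, p, dof, expected = chi2_contingency(table, correction=False)
    if (expected < 5).any():
        # Accepted standard when any expected cell count is below 5.
        odds_ratio, p = fisher_exact(table)
        print(f"Fisher exact p = {p:.3f}")
    else:
        print(f"Uncorrected chi-square = {chi2:.1f}, p = {p:.3f}")  # 8.7, 0.003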

Confidence Intervals for Measures of Association

Many medical and public health journals now require that associations be described by measures of association and CIs rather than p values or other statistical tests. A measure of association such as an RR or OR provides a single value (point estimate) that best quantifies the association between an exposure and health outcome. A CI provides an interval estimate or range of values that acknowledge the uncertainty of the single number point estimate, particularly one that is based on a sample of the population.

The 95% Confidence Interval

Statisticians define a 95% CI as the interval that, given repeated sampling of the source population, will include, or cover, the true association value 95% of the time. The epidemiologic concept of a 95% CI is that it includes the range of values consistent with the data in the study (6).
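One common way to construct such an interval for an RR is the log-transform approximation; this particular method is an assumption on my part rather than something prescribed by the text, but applied to the tularemia counts it reproduces the 95% CI of 1.3–4.9 quoted earlier:

    import math

    # Log-transform approximation for a 95% CI around an RR.
    # Counts are those implied by the tularemia attack rates in the text.
    a, H1 = 46, 173   # ill / total among tap water drinkers
    c, H0 = 9, 85     # ill / total among non-drinkers

    rr = (a / H1) / (c / H0)
    se_log_rr = math.sqrt(1/a - 1/H1 + 1/c - 1/H0)
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
    print(f"RR = {rr:.1f}, 95% CI = {lower:.1f}-{upper:.1f}")   # 2.5, 1.3-4.9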

Relation Between Chi-Square Test and Confidence Interval

The chi-square test and the CI are closely related. The chi-square test uses the observed data to determine the probability ( p value) under the null hypothesis, and one rejects the null hypothesis if the probability is less than alpha (e.g., 0.05). The CI uses a preselected probability value, alpha (e.g., 0.05), to determine the limits of the interval (1 − alpha = 0.95), and one rejects the null hypothesis if the interval does not include the null association value. Both indicate the precision of the observed association; both are influenced by the magnitude of the association and the size of the study group. Although both measure precision, neither addresses validity (lack of bias).

Interpreting the Confidence Interval

  • Meaning of a confidence interval . A CI can be regarded as the range of values consistent with the data in a study. Suppose a study conducted locally yields an RR of 4.0 for the association between intravenous drug use and disease X; the 95% CI ranges from 3.0 to 5.3. From that study, the best estimate of the association between intravenous drug use and disease X among the general population is 4.0, but the data are consistent with values anywhere from 3.0 to 5.3. A study of the same association conducted elsewhere that yielded an RR of 3.2 or 5.2 would be considered compatible, but a study that yielded an RR of 1.2 or 6.2 would not be considered compatible. Now consider a different study that yields an RR of 1.0, a CI from 0.9 to 1.1, and a p value = 0.9. Rather than interpreting these results as nonsignificant and uninformative, you can conclude that the exposure neither increases nor decreases the risk for disease. That message can be reassuring if the exposure had been of concern to a worried public. Thus, the values that are included in the CI and values that are excluded by the CI both provide important information.
  • Width of the confidence interval. The width of a CI (i.e., the included values) reflects the precision with which a study can pinpoint an association. A wide CI reflects a large amount of variability or imprecision. A narrow CI reflects less variability and higher precision. Usually, the larger the number of subjects or observations in a study, the greater the precision and the narrower the CI.
  • Relation of the confidence interval to the null hypothesis. Because a CI reflects the range of values consistent with the data in a study, the CI can be used as a substitute for statistical testing (i.e., to determine whether the data are consistent with the null hypothesis). Remember: the null hypothesis specifies that the RR or OR equals 1.0; therefore, a CI that includes 1.0 is compatible with the null hypothesis. This is equivalent to concluding that the null hypothesis cannot be rejected. In contrast, a CI that does not include 1.0 indicates that the null hypothesis should be rejected because it is inconsistent with the study results. Thus, the CI can be used as a surrogate test of statistical significance.

Confidence Intervals in the Foodborne Outbreak Setting

In the setting of a foodborne outbreak, the goal is to identify the food or other vehicle that caused illness. In this setting, a measure of the association (e.g., an RR or OR) is calculated to identify the food(s) or other consumable(s) with high values that might have caused the outbreak. The investigator does not usually care if the RR for a specific food item is 5.7 or 9.3, just that the RR is high and unlikely to be caused by chance and, therefore, that the item should be further evaluated. For that purpose, the point estimate (RR or OR) plus a p value is adequate and a CI is unnecessary.

For field investigations intended to identify one or more vehicles or risk factors for disease, consider constructing a single table that can summarize the associations for multiple exposures of interest. For foodborne outbreak investigations, the table typically includes one row for each food item and columns for the name of the food; numbers of ill and well persons, by food consumption history; food-specific attack rates (if a cohort study was conducted); RR or OR; chi-square or p value; and, sometimes, a 95% CI. The food most likely to have caused illness will usually have both of the following characteristics:

  • An elevated RR, OR, or chi-square (small p value), reflecting a substantial difference in attack rates among those who consumed that food and those who did not.
  • The majority of the ill persons had consumed that food; therefore, the exposure can explain or account for most if not all of the cases.

In the illustrative summary Table 8.3, tap water had the highest RR (and the only p value <0.05, based on the 95% CI excluding 1.0) and might account for 46 of 55 cases.

Abbreviation: CI, confidence interval. Source: Adapted from Reference 1 .

Stratification

Stratification is the examination of an exposure–disease association in two or more categories (strata) of a third variable (e.g., age). It is a useful tool for assessing whether confounding is present and, if it is, controlling for it. Stratification is also the best method for identifying effect modification. Both confounding and effect modification are addressed in following sections.

Stratification is also an effective method for examining the effects of two different exposures on a disease. For example, in a foodborne outbreak, two foods might seem to be associated with illness on the basis of elevated RRs or ORs. Possibly both foods were contaminated or included the same contaminated ingredient. Alternatively, the two foods might have been eaten together (e.g., peanut butter and jelly or doughnuts and milk), with only one being contaminated and the other guilty by association. Stratification is one way to tease apart the effects of the two foods.

Creating Strata of Two-by-Two Tables

  • To stratify by sex, create a two-by-two table for males and another table for females.
  • To stratify by age, decide on age groupings, making certain not to have overlapping ages; then create a separate two-by-two table for each age group.
  • For example, the data in Table 8.2 are stratified by sex in Handouts 8.6 and 8.7 . The RR for drinking tap water and experiencing oropharyngeal tularemia is 2.3 among females and 3.6 among males, but stratification also allows you to see that women have a higher risk than men, regardless of tap water consumption.

The Two-by-Four Table

Stratified tables (e.g., Handouts 8.6 and 8.7 ) are useful when the stratification variable is not of primary interest (i.e., is not being examined as a cause of the outbreak). However, when each of the two exposures might be the cause, a two-by-four table is better for disentangling the effects of the two variables. Consider a case–control study of a hypothetical hepatitis A outbreak that yielded elevated ORs both for doughnuts (OR = 6.0) and milk (OR = 3.9). The data organized in a two-by-four table ( Handout 8.8 ) disentangle the effects of the two foods—exposure to doughnuts alone is strongly associated with illness (OR = 6.0), but exposure to milk alone is not (OR = 1.0).

When two foods cause illness—for example when they are both contaminated or have a common ingredient—the two-by-four table is the best way to see their individual and joint effects.

Source: Adapted from Reference 1.

Crude odds ratio for doughnuts = 6.0; crude odds ratio for milk = 3.9.

  • To look for confounding, first examine the smallest and largest values of the stratum-specific measures of association and compare them with the value from the combined table (called the crude value). Two commonly used indications that confounding is present are:
  • The crude risk ratio or odds ratio is outside the range of the stratum-specific ones.
  • The crude risk ratio or odds ratio differs from the Mantel-Haenszel adjusted one by more than 10% or 20%.

Controlling for Confounding

  • One method of controlling for confounding is calculating a summary RR or OR based on a weighted average of the stratum-specific data. The Mantel-Haenszel technique (6) is a popular method for performing this task; a minimal sketch follows this list.
  • A second method is by using a logistic regression model that includes the exposure of interest and one or more confounding variables. The model produces an estimate of the OR that controls for the effect of the confounding variable(s).
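Here is the Mantel-Haenszel summary OR in a few lines of Python; the two strata below are hypothetical counts invented purely for illustration:

    # Mantel-Haenszel summary odds ratio across strata (Handout 8.3 layout).
    def mantel_haenszel_or(strata):
        """strata: list of (a, b, c, d) two-by-two tables, one per stratum."""
        num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
        den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
        return num / den

    # Two hypothetical strata (e.g., females and males), invented counts.
    strata = [(20, 10, 15, 40), (26, 14, 30, 45)]
    print(f"MH OR = {mantel_haenszel_or(strata):.2f}")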

Effect Modification

Effect modification (or effect measure modification) means that the degree of association between an exposure and an outcome differs among different population groups. For example, measles vaccine is usually highly effective in preventing disease if administered to children aged 12 months or older but is less effective if administered before age 12 months. Similarly, tetracycline can cause tooth mottling among children, but not adults. In both examples, the association (or effect) of the exposure (measles vaccine or tetracycline) is a function of, or is modified by, a third variable (age in both examples).

Because effect modification means different effects among different groups, the first step in looking for effect modification is to stratify the exposure–outcome association of interest by the third variable suspected to be the effect modifier. Next, calculate the measure of association (e.g., RR or OR) for each stratum. Finally, assess whether the stratum-specific measures of association are substantially different by using one of two methods.

  • Examine the stratum-specific measures of association. Are they different enough to be of public health or scientific importance?
  • Determine whether the variation in magnitude of the association is statistically significant by using the Breslow-Day Test for homogeneity of odds ratios or by testing the interaction term in logistic regression.

If effect modification is present, present each stratum-specific result separately.

Dose-Response

In epidemiology, dose-response means increased risk for the health outcome with increasing (or, for a protective exposure, decreasing) amount of exposure. Amount of exposure reflects quantity of exposure (e.g., milligrams of folic acid or number of scoops of ice cream consumed), duration of exposure (e.g., number of months or years of exposure), or both.

The presence of a dose-response effect is one of the well-recognized criteria for inferring causation. Therefore, when an association between an exposure and a health outcome has been identified based on an elevated RR or OR, consider assessing for a dose-response effect.

As always, the first step is to organize the data. One convenient format is a 2-by-H table, where H represents the categories or doses of exposure. An RR for a cohort study or an OR for a case–control study can be calculated for each dose relative to the lowest dose or the unexposed group ( Handout 8.9 ). CIs can be calculated for each dose. Reviewing the data and the measures of association in this format and displaying the measures graphically can provide a sense of whether a dose-response association is present. Additionally, statistical techniques can be used to assess such associations, even when confounders must be considered.

Matched-Pair Analysis

The basic data layout for a matched-pair analysis is a two-by-two table that seems to resemble the simple unmatched two-by-two tables presented earlier in this chapter, but it is different (Handout 8.10). In the matched-pair two-by-two table, each cell represents the number of matched pairs that meet the row and column criteria. In the unmatched two-by-two table, each cell represents the number of persons who meet the criteria.

In Handout 8.10 , cell e contains the number of pairs in which the case-patient is exposed and the control is exposed; cell f contains the number of pairs with an exposed case-patient and an unexposed control, cell g contains the number of pairs with an unexposed case-patient and an exposed control, and cell h contains the number of pairs in which neither the case-patient nor the matched control is exposed. Cells e and h are called concordant pairs because the case-patient and control are in the same exposure category. Cells f and g are called discordant pairs .

Odds ratio = f/g.

In a matched-pair analysis, only the discordant pairs are used to calculate the OR. The OR is computed as the ratio of the discordant pairs.

The test of significance for a matched-pair analysis is the McNemar chi-square test.

Handout 8.11 displays data from the classic pair-matched case–control study conducted in 1980 to assess the association between tampon use and toxic shock syndrome ( 7 ).

Odds ratio = 9/ 1 = 9.0; uncorrected McNemar chi-square test = 6.40 (p = 0.01). Source: Adapted from Reference 7 .
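A minimal Python sketch reproducing the matched-pair calculations from Handout 8.11 (9 discordant pairs with the case-patient exposed, 1 with the control exposed):

    from scipy.stats import chi2

    # Matched-pair analysis uses only the discordant pairs (Handout 8.10 notation).
    f, g = 9, 1   # f: case exposed / control unexposed; g: the reverse

    odds_ratio = f / g                  # 9.0
    mcnemar = (f - g) ** 2 / (f + g)    # uncorrected McNemar chi-square
    p_value = chi2.sf(mcnemar, df=1)
    print(f"OR = {odds_ratio:.1f}, chi-square = {mcnemar:.2f}, p = {p_value:.2f}")
    # -> OR = 9.0, chi-square = 6.40, p = 0.01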

  • Larger matched sets and variable matching. In certain studies, two, three, four, or a variable number of controls are matched with case-patients. The best way to analyze these larger or variable matched sets is to consider each set (e.g., triplet or quadruplet) as a unique stratum and then analyze the data by using the Mantel-Haenszel methods or logistic regression to summarize the strata (see Controlling for Confounding).
  • Does a matched design require a matched analysis? Usually, yes. In a pair-matched study, if the pairs are unique (e.g., siblings or friends), pair-matched analysis is needed. If the pairs are based on a nonunique characteristic (e.g., sex or grade in school), all of the case-patients and all of the controls from the same stratum (sex or grade) can be grouped together, and a stratified analysis can be performed.

In practice, some epidemiologists perform the matched analysis but then perform an unmatched analysis on the same data. If the results are similar, they might opt to present the data in unmatched fashion. In most instances, the unmatched OR will be closer to 1.0 than the matched OR (bias toward the null). This bias, which is related to confounding, might be either trivial or substantial. The chi-square test result from unmatched data can be particularly misleading because it is usually larger than the McNemar test result from the matched data. The decision to use a matched analysis or unmatched analysis is analogous to the decision to present crude or adjusted results; epidemiologic judgment must be used to avoid presenting unmatched results that are misleading.

Logistic Regression

In recent years, logistic regression has become a standard tool in the field epidemiologist’s toolkit because user-friendly software has become widely available and its ability to assess effects of multiple variables has become appreciated. Logistic regression is a statistical modeling method analogous to linear regression but for a binary outcome (e.g., ill/well or case/control). As with other types of regression, the outcome (the dependent variable) is modeled as a function of one or more independent variables. The independent variables include the exposure(s) of interest and, often, confounders and interaction terms.

  • The exponentiation of a given beta coefficient (e^β) equals the OR for that variable while controlling for the effects of all the other variables in the model.
  • If the model includes only the outcome variable and the primary exposure variable coded as (0,1), e^β should equal the OR you can calculate from the two-by-two table. For example, a logistic regression model of the oropharyngeal tularemia data with tap water as the only independent variable yields an OR of 3.06, exactly the same value to the second decimal as the crude OR. Similarly, a model that includes both tap water and sex as independent variables yields an OR for tap water of 3.24, almost identical to the Mantel-Haenszel OR for tap water controlling for sex of 3.26. (Note that logistic regression provides ORs rather than RRs, which is not ideal for field epidemiology cohort studies. A minimal fitting sketch follows this list.)
  • Logistic regression also can be used to assess dose-response associations, effect modification, and more complex associations. A variant of logistic regression called conditional logistic regression is particularly appropriate for pair-matched data.
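The sketch below fits a logistic model with the statsmodels library and exponentiates the coefficients to obtain ORs. The data are simulated; the variable names, sample size, and coefficients are invented for illustration and do not reproduce the tularemia estimates:

    import numpy as np
    import statsmodels.api as sm

    # Simulated 0/1 data: ill (outcome), tap_water and sex (exposures).
    rng = np.random.default_rng(1)
    n = 200
    tap_water = rng.integers(0, 2, n)
    sex = rng.integers(0, 2, n)
    # Simulate illness with a genuine tap water effect, for illustration.
    p = 1 / (1 + np.exp(-(-2.0 + 1.1 * tap_water + 0.3 * sex)))
    ill = rng.binomial(1, p)

    X = sm.add_constant(np.column_stack([tap_water, sex]))
    model = sm.Logit(ill, X).fit(disp=False)
    print(np.exp(model.params[1:]))   # ORs for tap water and sex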

Sophisticated analytic techniques cannot atone for sloppy data! Analytic techniques such as those described in this chapter are only as good as the data to which they are applied. Analytic techniques—whether simple, stratified, or modeling—use the information at hand. They do not know or assess whether the correct comparison group was selected, the response rate was adequate, exposure and outcome were accurately defined, or the data coding and entry were free of errors. Analytic techniques are merely tools; the analyst is responsible for knowing the quality of the data and interpreting the results appropriately.

A computer can crunch numbers more quickly and accurately than the investigator can by hand, but the computer cannot interpret the results. For a two-by-two table, Epi Info provides both an RR and an OR, but the investigator must choose which is best based on the type of study performed. For that table, the RR and the OR might be elevated; the p value might be less than 0.05; and the 95% CI might not include 1.0. However, do those statistical results guarantee that the exposure is a true cause of disease? Not necessarily. Although the association might be causal, flaws in study design, execution, and analysis can result in apparent associations that are actually artifacts. Chance, selection bias, information bias, confounding, and investigator error should all be evaluated as possible explanations for an observed association. The first step in evaluating whether an apparent association is real and causal is to review the list of factors that can cause a spurious association, as listed in Epidemiologic Interpretation Checklist 1 ( Box 8.4 ).

Epidemiologic Interpretation Checklist 1

  • Chance
  • Selection bias
  • Information bias
  • Confounding
  • Investigator error
  • True association

Chance is one possible explanation for an observed association between exposure and outcome. Under the null hypothesis, you assume that your study population is a sample from a source population in which that exposure is not associated with disease; that is, the RR and OR equal 1. Could an elevated (or lowered) OR be attributable simply to variation caused by chance? The role of chance is assessed by using tests of significance (or, as noted earlier, by interpreting CIs). Chance is an unlikely explanation if

  • The p value is less than alpha (usually set at 0.05), or
  • The CI for the RR or OR excludes 1.0.

However, chance can never be ruled out entirely. Even if the p value is as small as 0.01, that study might be the one study in 100 in which the null hypothesis is true and chance is the explanation. Note that tests of significance evaluate only the role of chance—they do not address the presence of selection bias, information bias, confounding, or investigator error.

Selection bias is a systematic error in the designation of the study groups or in the enrollment of study participants that results in a mistaken estimate of an exposure's effect on the risk for disease. Selection bias can be thought of as a problem resulting from who gets into the study or how. Selection bias can arise from the faulty design of a case–control study through, for example, use of an overly broad case definition (so that some persons in the case group do not actually have the disease being studied) or inappropriate control group, or when asymptomatic cases are undetected among the controls. In the execution phase, selection bias can result if eligible persons with certain exposure and disease characteristics choose not to participate or cannot be located. For example, if ill persons with the exposure of interest know the hypothesis of the study and are more willing to participate than other ill persons, cell a in the two-by-two table will be artificially inflated compared with cell c, and the OR also will be inflated. Evaluating the possible role of selection bias requires examining how case-patients and controls were specified and were enrolled.

Information bias is a systematic error in the data collection from or about the study participants that results in a mistaken estimate of an exposure's effect on the risk for disease. Information bias might arise from poor wording or misunderstanding of a question on a questionnaire; poor recall; inconsistent interviewing technique; or a person knowingly providing false information, either to hide the truth or, as is common in certain cultures, in an attempt to please the interviewer.

Confounding is the distortion of an exposure–disease association by the effect of a third factor, as discussed earlier in this chapter. To evaluate the role of confounding, ensure that potential confounders have been identified, evaluated, and controlled for as necessary.

Investigator error can occur at any step of a field investigation, including design, conduct, analysis, and interpretation. In the analysis, a misplaced semicolon in a computer program, an erroneous transcription of a value, use of the wrong formula, or misreading of results can all yield artifactual associations. Preventing this type of error requires rigorous checking of work and asking colleagues to carefully review the work and conclusions.

To reemphasize, before considering whether an association is causal, consider whether the association can be explained by chance, selection bias, information bias, confounding, or investigator error . Now suppose that an elevated RR or OR has a small p value and narrow CI that does not include 1.0; therefore, chance is an unlikely explanation. Specification of case-patients and controls was reasonable and participation was good; therefore, selection bias is an unlikely explanation. Information was collected by using a standard questionnaire by an experienced and well-trained interviewer. Confounding by other risk factors was assessed and determined not to be present or to have been controlled for. Data entry and calculations were verified. However, before concluding that the association is causal, the strength of the association, its biologic plausibility, consistency with results from other studies, temporal sequence, and dose-response association, if any, need to be considered ( Box 8.5 ).

Epidemiologic Interpretation Checklist 2

  • Strength of the association
  • Biologic plausibility
  • Consistency with other studies
  • Exposure precedes disease
  • Dose-response effect

Strength of the association means that a stronger association has more causal credibility than a weak one. If the true RR is 1.0, subtle selection bias, information bias, or confounding can result in an RR of 1.5, but the bias would have to be dramatic and hopefully obvious to the investigator to account for an RR of 9.0.

Biologic plausibility means an association has causal credibility if it is consistent with the known pathophysiology, known vehicles, natural history of the health outcome, animal models, and other relevant biological factors. For an implicated food vehicle in an infectious disease outbreak, has the food been implicated in previous outbreaks, or—even better—has the agent been identified in the food? Although some outbreaks are caused by new or previously unrecognized pathogens, vehicles, or risk factors, most are caused by those that have been recognized previously.

Consider consistency with other studies. Are the results consistent with those from previous studies? A finding is more plausible if it has been replicated by different investigators using different methods for different populations.

Exposure precedes disease seems obvious, but in a retrospective cohort study, documenting that exposure precedes disease can be difficult. Suppose, for example, that persons with a particular type of leukemia are more likely than controls to have antibodies to a particular virus. It might be tempting to conclude that the virus caused the leukemia, but caution is required because viral infection might have occurred after the onset of leukemic changes.

Evidence of a dose-response effect adds weight to the evidence for causation. A dose-response effect is not a necessary feature for an association to be causal; some causal association might exhibit a threshold effect, for example. Nevertheless, it is usually thought to add credibility to the association.

In many field investigations, a likely culprit might not meet all the criteria discussed in this chapter. Perhaps the response rate was less than ideal, the etiologic agent could not be isolated from the implicated food, or no dose-response was identified. Nevertheless, if the public’s health is at risk, failure to meet every criterion should not be used as an excuse for inaction. As George Comstock stated, “The art of epidemiologic reasoning is to draw sensible conclusions from imperfect data” ( 8 ). After all, field epidemiology is a tool for public health action to promote and protect the public’s health on the basis of science (sound epidemiologic methods), causal reasoning, and a healthy dose of practical common sense.

All scientific work is incomplete—whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have, or to postpone the action it seems to demand at a given time ( 9 ).

— Sir Austin Bradford Hill (1897–1991), English Epidemiologist and Statistician

  • Aktas D, Celebi B, Isik ME, et al. Oropharyngeal tularemia outbreak associated with drinking contaminated tap water, Turkey, July–September 2013. Emerg Infect Dis. 2015;21:2194–6.
  • Centers for Disease Control and Prevention. Epi Info. https://www.cdc.gov/epiinfo/index.html
  • Rentz ED, Lewis L, Mujica OJ, et al. Outbreak of acute renal failure in Panama in 2006: a case–control study. Bull World Health Organ. 2008;86:749–56.
  • Edlin BR, Irwin KL, Faruque S, et al. Intersecting epidemics—crack cocaine use and HIV infection among inner-city young adults. N Engl J Med. 1994;331:1422–7.
  • Centers for Disease Control and Prevention. Varicella outbreak among vaccinated children—Nebraska, 2004. MMWR. 2006;55:749–52.
  • Rothman KJ. Epidemiology: an introduction. New York: Oxford University Press; 2002. p. 113–29.
  • Shands KN, Schmid GP, Dan BB, et al. Toxic-shock syndrome in menstruating women: association with tampon use and Staphylococcus aureus and clinical features in 52 women. N Engl J Med. 1980;303:1436–42.
  • Comstock GW. Vaccine evaluation by case–control or prospective studies. Am J Epidemiol. 1990;131:205–7.
  • Hill AB. The environment and disease: association or causation? Proc R Soc Med. 1965;58:295–300.



Chapter 10: Qualitative Data Collection & Analysis Methods

10.5 Analysis of Qualitative Interview Data

Analysis of qualitative interview data typically begins with a set of transcripts of the interviews conducted. Obtaining said transcripts requires either having taken exceptionally good notes during an interview or, preferably, recorded the interview and then transcribed it. To transcribe an interview means to create a complete, written copy of the recorded interview by playing the recording back and typing in each word that is spoken on the recording, noting who spoke which words. In general, it is best to aim for a verbatim transcription, i.e., one that reports word for word exactly what was said in the recorded interview. If possible, it is also best to include nonverbal responses in the written transcription of an interview (if the interview is completed face-to-face, or some other form of visual contact is maintained, such as with Skype). Gestures made by respondents should be noted, as should the tone of voice and notes about when, where, and how spoken words may have been emphasized by respondents.

If you have the time, it is best to transcribe your interviews yourself. If the researcher who conducted the interviews transcribes them herself, that person will also be able to record associated nonverbal behaviors and interactions that may be relevant to analysis but that could not be picked up by audio recording. Interviewees may roll their eyes, wipe tears from their face, and even make obscene gestures that speak volumes about their feelings; such nonverbal cues cannot be captured by an audio recording, so being able to remember them and record them in writing while transcribing the interviews is invaluable.

Overall, the goal of analysis is to reach some inferences, lessons, or conclusions by condensing large amounts of data into relatively smaller, more manageable bits of understandable information. Analysis of qualitative interview data often works inductively (Glaser & Strauss, 1967; Patton, 2001). To move from the specific observations an interviewer collects to identifying patterns across those observations, qualitative interviewers will often begin by reading through transcripts of their interviews and trying to identify codes. A code is a shorthand representation of some more complex set of issues or ideas. The process of identifying codes in one’s qualitative data is often referred to as coding . Coding involves identifying themes across interview data by reading and re-reading (and re-reading again) interview transcripts, until the researcher has a clear idea about what sorts of themes come up across the interviews. Coding helps to achieve the goal of data management and data reduction (Palys & Atchison, 2014, p. 304).

Coding can be inductive or deductive. Deductive coding is the approach used by research analysts who have a well-specified or pre-defined set of interests (Palys & Atchison, 2014, p. 304). The process of deductive coding begins with the analyst utilizing those specific or pre-defined interests to identify "relevant" passages, quotes, images, scenes, etc., to develop a set of preliminary codes (often referred to as descriptive coding). From there, the analyst elaborates on these preliminary codes, making finer distinctions within each coding category (known as interpretative coding). Pattern coding is another step an analyst might take as different associations become apparent. For example, if you are studying at-risk behaviours in youth, and you discover that the various behaviours have different characteristics and meanings depending upon the social context (e.g., school, family, work) in which the various behaviours occur, you have identified a pattern (Palys & Atchison, 2014, p. 304).

In contrast, inductive coding begins with the identification of general themes and ideas that emerge as the researcher reads through the data. This process is also referred to as open coding (Palys & Atchison, 2014, p. 305), because it will probably require multiple analyses. As you read through your transcripts, it is likely that you will begin to see some commonalities across the categories or themes that you’ve jotted down (Saylor Academy, 2012). The open coding process can go one of two ways: either the researcher elaborates on a category by making finer, and then even finer distinctions, or the researcher starts with a very specific descriptive category that is subsequently collapsed into another category (Palys & Atchison, 2014, p. 305). In other words, the development and elaboration of codes arise out of the material that is being examined.

The next step for the research analyst is to begin more specific coding, which is known as focused or axial coding . Focused coding involves collapsing or narrowing themes and categories identified in open coding by reading through the notes you made while conducting open coding, identifying themes or categories that seem to be related, and perhaps merging some. Then give each collapsed/merged theme or category a name (or code) and identify passages of data that fit each named category or theme. To identify passages of data that represent your emerging codes, you will need to read through your transcripts several times. You might also write up brief definitions or descriptions of each code. Defining codes is a way of giving meaning to your data, and developing a way to talk about your findings and what your data means (Saylor Academy, 2012).

As tedious and laborious as it might seem to read through hundreds of pages of transcripts multiple times, sometimes getting started with the coding process is actually the hardest part. If you find yourself struggling to identify themes at the open coding stage, ask yourself some questions about your data. The answers should give you a clue about what sorts of themes or categories you are reading (Saylor Academy, 2012). Lofland and Lofland (1995) identify a set of questions that are useful when coding qualitative data. They suggest asking the following:

  • Of what topic, unit, or aspect is this an instance?
  • What question about a topic does this item of data suggest?
  • What sort of answer to a question about a topic does this item of data suggest (i.e., what proposition is suggested)?

Asking yourself these questions about the passages of data that you are reading can help you begin to identify and name potential themes and categories.

The Table 10.3 "Interview coding" example is drawn from research presented by Saylor Academy (2012), in which two codes emerged from an inductive analysis of transcripts from interviews with child-free adults. Table 10.3 also includes a brief description of each code and a few (of many) interview excerpts from which each code was developed.

Table 10.3 Interview coding

Just as quantitative researchers rely on the assistance of specialized computer programs designed to help sort through and analyze their data, so do qualitative researchers. Where quantitative researchers have SPSS and MicroCase (and many others), qualitative researchers have programs such as NVivo (http://www.qsrinternational.com) and ATLAS.ti (http://www.atlasti.com). These programs are specifically designed to assist qualitative researchers with organizing, managing, sorting, and analyzing large amounts of qualitative data. The programs allow researchers to import interview transcripts contained in an electronic file and then label or code passages, cut and paste passages, search for various words or phrases, and organize complex interrelationships among passages and codes.
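As a toy illustration of the kind of bookkeeping such software automates at scale, a few lines of Python can tally hand-assigned codes across transcript excerpts. The interview IDs, codes, and excerpts below are invented for illustration:

    from collections import Counter

    # Hand-assigned codes attached to transcript excerpts (invented data).
    coded_passages = [
        ("interview_01", "reproduction", "I never felt the pull to have kids."),
        ("interview_01", "stigma", "People assume something must be wrong with me."),
        ("interview_02", "stigma", "Relatives keep asking when, never if."),
    ]

    code_counts = Counter(code for _, code, _ in coded_passages)
    print(code_counts)   # Counter({'stigma': 2, 'reproduction': 1})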

Research Methods for the Social Sciences: An Introduction Copyright © 2020 by Valerie Sheppard is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Data Analysis in Research: Types & Methods

Data analysis is a crucial step in the research process, transforming raw data into meaningful insights that drive informed decisions and advance knowledge. This article explores the various types and methods of data analysis in research, providing a comprehensive guide for researchers across disciplines.


Overview of Data Analysis in Research

Data analysis in research is the systematic use of statistical and analytical tools to describe, summarize, and draw conclusions from datasets. This process involves organizing, analyzing, modeling, and transforming data to identify trends, establish connections, and inform decision-making. The main goals include describing data through visualization and statistics, making inferences about a broader population, predicting future events using historical data, and providing data-driven recommendations. The stages of data analysis involve collecting relevant data, preprocessing to clean and format it, conducting exploratory data analysis to identify patterns, building and testing models, interpreting results, and effectively reporting findings.

  • Main Goals : Describe data, make inferences, predict future events, and provide data-driven recommendations.
  • Stages of Data Analysis : Data collection, preprocessing, exploratory data analysis, model building and testing, interpretation, and reporting.

Types of Data Analysis

1. Descriptive Analysis

Descriptive analysis focuses on summarizing and describing the features of a dataset. It provides a snapshot of the data, highlighting central tendencies, dispersion, and overall patterns.

  • Central Tendency Measures : Mean, median, and mode are used to identify the central point of the dataset.
  • Dispersion Measures : Range, variance, and standard deviation help in understanding the spread of the data.
  • Frequency Distribution : This shows how often each value in a dataset occurs.
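A quick Python illustration of these measures using the standard library's statistics module (the numbers are arbitrary sample data):

    import statistics as st

    scores = [4, 8, 15, 16, 23, 42]          # arbitrary sample data
    print(st.mean(scores))                   # 18 (central tendency)
    print(st.median(scores))                 # 15.5
    print(st.pstdev(scores))                 # population standard deviation
    print(st.multimode([1, 2, 2, 3, 3]))     # [2, 3]; handles ties safely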

2. Inferential Analysis

Inferential analysis allows researchers to make predictions or inferences about a population based on a sample of data. It is used to test hypotheses and determine the relationships between variables.

  • Hypothesis Testing : Techniques like t-tests, chi-square tests, and ANOVA are used to test assumptions about a population.
  • Regression Analysis : This method examines the relationship between dependent and independent variables.
  • Confidence Intervals : These provide a range of values within which the true population parameter is expected to lie.
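For instance, a two-sample t-test can be run with SciPy (illustrative numbers; assumes SciPy is installed):

    from scipy import stats

    # Illustrative two-sample t-test (invented scores for two groups).
    group_a = [72, 75, 78, 80, 69, 74]
    group_b = [65, 70, 68, 66, 71, 64]

    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    # If p < 0.05, reject the null hypothesis of equal group means.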

3. Exploratory Data Analysis (EDA)

EDA is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. It helps in discovering patterns, spotting anomalies, and checking assumptions with the help of graphical representations.

  • Visual Techniques : Histograms, box plots, scatter plots, and bar charts are commonly used in EDA.
  • Summary Statistics : Basic statistical measures are used to describe the dataset.
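A minimal matplotlib sketch producing two of these views from simulated data:

    import numpy as np
    import matplotlib.pyplot as plt

    # Simulated sample for illustration.
    data = np.random.default_rng(0).normal(loc=50, scale=10, size=500)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
    ax1.hist(data, bins=30)           # shape of the distribution
    ax1.set_title("Histogram")
    ax2.boxplot(data, vert=False)     # spread, median, and outliers
    ax2.set_title("Box plot")
    plt.show()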

4. Predictive Analysis

Predictive analysis uses statistical techniques and machine learning algorithms to predict future outcomes based on historical data.

  • Machine Learning Models : Algorithms like linear regression, decision trees, and neural networks are employed to make predictions.
  • Time Series Analysis : This method analyzes data points collected or recorded at specific time intervals to forecast future trends.
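For example, a simple predictive model can be fitted with scikit-learn's linear regression. The ad-spend and sales figures below are made up; the point is the fit-then-predict pattern:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical historical data: monthly ad spend (k$) vs sales (k$)
ad_spend = np.array([[10], [15], [20], [25], [30], [35]])
sales = np.array([110, 135, 160, 182, 210, 231])

model = LinearRegression().fit(ad_spend, sales)
print("Slope:    ", model.coef_[0])
print("Intercept:", model.intercept_)

# Predict sales for a future spend level the model has not seen
print("Predicted sales at 40k spend:", model.predict([[40]])[0])
```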

5. Causal Analysis

Causal analysis aims to identify cause-and-effect relationships between variables. It helps in understanding the impact of one variable on another.

  • Experiments : Controlled experiments are designed to test the causality.
  • Quasi-Experimental Designs : These are used when controlled experiments are not feasible.

6. Mechanistic Analysis

Mechanistic analysis seeks to understand the underlying mechanisms or processes that drive observed phenomena. It is common in fields like biology and engineering.

Methods of Data Analysis

1. Quantitative Methods

Quantitative methods involve numerical data and statistical analysis to uncover patterns, relationships, and trends.

  • Statistical Analysis : Includes various statistical tests and measures.
  • Mathematical Modeling : Uses mathematical equations to represent relationships among variables.
  • Simulation : Computer-based models simulate real-world processes to predict outcomes.
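The simulation bullet can be as lightweight as a Monte Carlo experiment. The sketch below estimates a stockout probability under an assumed Poisson demand model; both the demand distribution and the stock level are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_days = 100_000

# Hypothetical model: daily demand ~ Poisson(mean=20), stock on hand = 25
demand = rng.poisson(lam=20, size=n_days)

# Fraction of simulated days on which demand exceeds stock
stockout_prob = (demand > 25).mean()
print(f"Estimated probability of a stockout: {stockout_prob:.3f}")
```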

2. Qualitative Methods

Qualitative methods focus on non-numerical data, such as text, images, and audio, to understand concepts, opinions, or experiences.

  • Content Analysis : Systematic coding and categorizing of textual information.
  • Thematic Analysis : Identifying themes and patterns within qualitative data.
  • Narrative Analysis : Examining the stories or accounts shared by participants.
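Parts of content analysis can be assisted by simple scripts. The sketch below counts how many invented interview excerpts touch each code in a small hypothetical coding frame; in real qualitative work the coding frame, and the judgement behind applying it, come from the researcher rather than the script:

```python
from collections import Counter
import re

# Hypothetical interview excerpts (in practice, full transcripts)
responses = [
    "The waiting time was too long, but the staff were friendly.",
    "Friendly staff, although the waiting room felt crowded.",
    "I appreciated the clear communication about my treatment.",
]

# A simple coding frame: code -> keywords that signal it (exact word match)
codes = {
    "waiting": ["waiting", "wait"],
    "staff_attitude": ["friendly", "rude"],
    "communication": ["communication", "explained", "clear"],
}

counts = Counter()
for text in responses:
    words = re.findall(r"[a-z]+", text.lower())
    for code, keywords in codes.items():
        if any(k in words for k in keywords):
            counts[code] += 1

print(counts)  # how many responses touch each code
```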

3. Mixed Methods

Mixed methods combine both quantitative and qualitative approaches to provide a more comprehensive analysis.

  • Sequential Explanatory Design : Quantitative data is collected and analyzed first, followed by qualitative data to explain the quantitative results.
  • Concurrent Triangulation Design : Both qualitative and quantitative data are collected simultaneously but analyzed separately to compare results.

4. Data Mining

Data mining involves exploring large datasets to discover patterns and relationships.

  • Clustering : Grouping data points with similar characteristics.
  • Association Rule Learning : Identifying interesting relations between variables in large databases.
  • Classification : Assigning items to predefined categories based on their attributes.
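As a small clustering example, the following sketch groups simulated customers with k-means in scikit-learn. The two customer segments are generated artificially so that the clusters are easy to see:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
# Hypothetical customer data: [annual spend (k$), visits per month]
customers = np.vstack([
    rng.normal([5, 2], 0.8, size=(50, 2)),   # low-spend, infrequent
    rng.normal([20, 8], 1.5, size=(50, 2)),  # high-spend, frequent
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print("Cluster centres:\n", kmeans.cluster_centers_)
print("First five labels:", kmeans.labels_[:5])
```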

5. Big Data Analytics

Big data analytics involves analyzing vast amounts of data to uncover hidden patterns, correlations, and other insights.

  • Hadoop and Spark : Frameworks for processing and analyzing large datasets.
  • NoSQL Databases : Designed to handle unstructured data.
  • Machine Learning Algorithms : Used to analyze and predict complex patterns in big data.
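A minimal PySpark sketch of a distributed aggregation is shown below, assuming a local Spark installation; the file path and the user_id column are placeholders, not a real dataset:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("BigDataSketch").getOrCreate()

# Hypothetical event log: one or many CSVs with a user_id column
events = spark.read.csv("events/*.csv", header=True, inferSchema=True)

# Distributed aggregation: events per user, top ten most active users
(events.groupBy("user_id")
       .agg(F.count("*").alias("n_events"))
       .orderBy(F.desc("n_events"))
       .show(10))

spark.stop()
```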

Applications and Case Studies

Numerous fields and industries use data analysis methods, which provide insightful information and facilitate data-driven decision-making. The following case studies demonstrate the effectiveness of data analysis in research:

Medical Care:

  • Predicting Patient Readmissions: By using data analysis to create predictive models, healthcare facilities may better identify patients who are at high risk of readmission and implement focused interventions to enhance patient care.
  • Disease Outbreak Analysis: Researchers can monitor and forecast disease outbreaks by examining both historical and current data. This information aids public health authorities in putting preventative and control measures in place.

Finance:

  • Fraud Detection: To safeguard clients and lessen financial losses, financial institutions use data analysis tools to identify fraudulent transactions and activities.
  • Investing Strategies: By using data analysis, quantitative investing models that detect trends in stock prices may be created, assisting investors in optimizing their portfolios and making well-informed choices.

Marketing:

  • Customer Segmentation: Businesses may divide up their client base into distinct groups using data analysis, which makes it possible to launch focused marketing efforts and provide individualized services.
  • Social Media Analytics: By analyzing social media data to track brand sentiment, identify influencers, and understand consumer preferences, marketers can develop more successful marketing strategies.

Education:

  • Predicting Student Performance: By using data analysis tools, educators may identify at-risk students and forecast their performance. This allows them to provide individualized learning plans and timely interventions.
  • Education Policy Analysis: Data may be used by researchers to assess the efficacy of policies, initiatives, and programs in education, offering insights for evidence-based decision-making.

Social Science Fields:

  • Opinion Mining in Politics: By examining public opinion data from news stories and social media platforms, academics and policymakers may gain insight into prevailing political opinions and better understand how the public feels about certain topics or candidates.
  • Crime Analysis: By studying crime data, researchers may spot trends, anticipate high-risk locations, and help law enforcement allocate resources wisely in order to deter and reduce crime.

Data analysis is a crucial step in the research process because it enables companies and researchers to glean insightful information from data. By using diverse analytical methodologies and approaches, scholars may reveal latent patterns, arrive at well-informed conclusions, and tackle intricate research inquiries. Numerous statistical, machine learning, and visualization approaches are among the many data analysis tools available, offering a comprehensive toolbox for addressing a broad variety of research problems.

Data Analysis in Research FAQs:

What are the main phases in the process of analyzing data?

In general, the steps involved in data analysis include gathering data, preparing it, doing exploratory data analysis, constructing and testing models, interpreting the results, and reporting the findings. Every stage is essential to guaranteeing the accuracy and effectiveness of the analysis.

What are the differences between the examination of qualitative and quantitative data?

Qualitative data analysis deals with non-numerical data, such as text, pictures, or observations, and often employs content analysis, grounded theory, or ethnography to comprehend and interpret it. Quantitative data analysis, by comparison, works with numerical data and uses statistical methods to describe the data, draw inferences, and forecast trends.

What are a few popular statistical methods for analyzing data?

In data analysis, descriptive statistics, inferential statistics, and predictive modeling are often used. Descriptive statistics highlight the fundamental characteristics of the data, while inferential statistics test assumptions and draw inferences about a wider population. Predictive modeling is used to forecast unknown values or future events.

In what ways might data analysis methods be used in the healthcare industry?

In the healthcare industry, data analysis may be used to optimize treatment regimens, monitor disease outbreaks, forecast patient readmissions, and enhance patient care. It is also essential for medication development, clinical research, and the creation of healthcare policies.

What difficulties may one encounter while analyzing data?

Typical problems include data quality issues such as missing values, outliers, and biased samples, all of which may affect how accurate the analysis is. Furthermore, analyzing big and complicated datasets can be computationally demanding, necessitating specialized tools and expertise. It is also critical to handle ethical issues, such as data security and privacy.


Missing data in emergency care: a pitfall in the interpretation of analysis and research based on electronic patient records

  • Timothy J Coats 1
  • Evgeny M Mirkes 1, 2
  • 1 University of Leicester, Leicester, UK
  • 2 School of Computing and Mathematical Sciences, University of Leicester, Leicester, UK
  • Correspondence to Professor Timothy J Coats, University of Leicester, Leicester LE1 7RH, UK; tc61{at}le.ac.uk
Electronic patient records (EPRs) are potentially valuable sources of data for service development or research but often contain large amounts of missing data. Using complete case analysis or imputation of missing data seem like simple solutions, and are increasingly easy to perform in software packages, but can easily distort data and give misleading results if used without an understanding of missingness. So, knowing about patterns of missingness, and when to get expert data science (data engineering and analytics) help, will be a fundamental future skill for emergency physicians. This will maximise the good and minimise the harm of the easy availability of large patient datasets created by the introduction of EPRs.

  • Data Interpretation, Statistical
  • Routinely Collected Health Data

https://doi.org/10.1136/emermed-2024-214097


Introduction

Missing data (using this term to mean all the blank cells in a dataset) can create bias, a systematic error that leads to wrong results with the potential to harm patients. In clinical studies the researchers often go to great lengths (and costs) to obtain complete datasets, although many papers still have little or no analysis of missing data. However, when using routinely collected data there is a larger potential for bias from both a high volume of missing data and systematic patterns underlying missingness.

The recent explosion in the use of electronic patient records (EPR) has opened new opportunities in research, quality improvement and clinical care. Data about patients and outcomes will be widely available and no longer the domain of analytics specialists. All emergency physicians will soon have access to large patient datasets on their desktops with powerful data analytical and modelling software packages. This has the potential to transform the way in which clinicians and managers can understand and change healthcare systems and practice. However, like the pitfalls that await the non-specialist when using a powerful statistical package, this new combination of big data, computer power and machine learning analytics/modelling packages also has the potential to give misleading results. 1 2

Emergency care routine datasets contain large numbers of blank cells (missing data), 3 4 but in the past the need for missing data expertise was confined to academic researchers. However, as local EPR-derived datasets become available, all emergency physicians need a better understanding of missing data to critically appraise both local information and research papers. Simple methods of dealing with missing data, such as complete case analysis (only analysing the cases with complete data) or simple imputation (using widely available software to ‘fill in the blanks’), are easy to do in modern analytics packages. However, they are highly prone to introduce bias if used without an awareness of how to analyse missing data and how to minimise potential distortion of conclusions. For example, analysis based on UK NHS data often does not tell us that 5% of cases (which are probably not a random group) are missing due to the NHS data opt-out.

Emergency medicine has been at the forefront of training in critical appraisal and the use of evidence-based medicine. However, our skills in these areas are negated if the analysis that we have positively appraised is based on a dataset with a hidden bias due to missing data. In this review, we will discuss how to recognise missing data, identify its potential impact on the results and use appropriate methods to compensate. (For a glossary, see Table 1.)


Types of missingness

There are different patterns of missingness with different potentials for bias and different implications for the way in which the data can be analysed. 5 Missingness may be associated with either observed factors (things that are recorded in the dataset) or unobserved factors (things that are not recorded in the dataset). It is easier to find and compensate for patterns of missingness when they are related to observed factors (because the observed data can inform modelling of the missing data). 6 Missingness can be described as having one of the following patterns:

Missing completely at random ( MCAR ): Here the missing data are completely independent of both observed and unobserved patient characteristics. An example is when staff sometimes just forget to record a patient’s GCS. MCAR data have a low potential to bias the analysis as those with and without a recorded GCS are otherwise similar.

Missing at random ( MAR ): This category can be confusing because the word ‘random’ is used. The data are in fact missing for a reason—but that reason is not related to the data itself (eg, when the ED is very busy staff may not have time to enter a GCS into the EPR). In technical terms, the property ‘missing’ is not related to the missing value (and so appears to be random when you look at the data field that you are interested in). MAR data can be difficult to identify and can bias analysis. However, if information has been collected about the factor that caused the data to be missing (eg, if the ‘busyness’ of the ED was recorded in the GCS example) then this can be identified and adjusted for in the analysis.

Missing not at random ( MNAR ): In this case missing data are related to the missing value itself. For example, if a ‘minors’ patient is walking and talking ED staff may not record a GCS, as the patient is obviously well. The reason why the data were not recorded (cause of the missingness) is related to the missing variable itself (the GCS was normal). MNAR data are important to recognise as it causes bias and is difficult to adjust for in the analysis. For example, if all the high GCS patients are missing from a dataset, any imputation of GCS will cause bias, as the imputation algorithm has no data about high GCS patients.

Obligatory absent data: For example, the variable ‘Time to operation’ must be absent for all patients who did not have an operation. This type of absent data is also called ‘data which do not exist’ or ‘data which must be missing’. This type is important as any attempt to impute these missing values will distort the input dataset and must not be made (eg, imputing pregnancy test results for cis-gendered men will make the dataset nonsensical). In one sense, this type of data is not missing (as it cannot exist)—but the blank cell created in the dataset presents the same issues for big data analytics as a blank cell due to missing data.

Dark data 7 : These are data which are not known to be missing (eg, the patients who have used the NHS data opt-out will not be present in the dataset). This type of missing data cannot be seen and cannot be assessed, so any impact cannot be known and no adjustment to the analysis can be made. The NHS data opt-out is about 5%, but varies widely across the country, giving the potential to create misleading results for both service analysis and research, without any indication that this has happened. 8 This bias could occur even if the very best methods are used in the analysis within a publication that seems of the highest quality on critical appraisal.
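A small simulation makes the difference in bias concrete. The Python sketch below uses invented GCS values and invented missingness probabilities: deleting MCAR values leaves the observed mean roughly unchanged, while MNAR missingness concentrated in well patients (GCS 15) drags the observed mean down:

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical 'true' GCS values for 10,000 ED patients (bounded 3-15)
gcs = np.clip(np.round(rng.normal(13, 3, 10_000)), 3, 15)

# MCAR: 30% of values missing completely at random
mcar_mask = rng.random(gcs.size) < 0.30

# MNAR: well patients (GCS 15) are far more likely to go unrecorded
p_missing = np.where(gcs == 15, 0.60, 0.05)
mnar_mask = rng.random(gcs.size) < p_missing

print("True mean GCS:           ", gcs.mean().round(2))
print("Observed mean under MCAR:", gcs[~mcar_mask].mean().round(2))  # ~unbiased
print("Observed mean under MNAR:", gcs[~mnar_mask].mean().round(2))  # biased low
```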

Missing data in EPRs

The clinical dataset underlying an EPR can be thought of as a table with a row for each patient and a column for each variable in the EPR (this is an oversimplified but useful description). There are many reasons why data might be missing in a routinely collected ED EPR dataset and depending on the reason the missing data can be classified into one of the above categories.

Patient too sick—staff cannot prioritise recording data—MNAR.

Patient too well—staff think recording data not relevant—MNAR.

Staff too busy—no time for data recording—MAR if the variable does not relate to how busy the ED is (such as the patient’s age); however, missing variables related to how busy the department is (eg, time to triage or administration of a drug) may be subject to bias due to MNAR.

Data not available for staff to record—potential for bias depends on reason for non-availability, for example, low potential from random breakdown of near patient testing (MCAR), but a higher potential (MAR) if there is a pattern to the machine breakdown (such as no technician at night).

Data not relevant to the patient—there are many thousands of potential tests and interventions within a medical dataset, but any one patient will only undergo a small subset. This means that the cells in the dataset relating to all of the non-performed tests or procedures will be blank for that patient (anecdotally, hospital datasets have more than 90% missing data—because the vast majority of data fields are not relevant to a particular patient). These data points are absent (as they were not generated) or obligatory missing, but as noted above this blank cell in a dataset presents the same issues for data analytics as a blank cell due to other forms of missing data.

Staff not engaged—data recording requires additional work with no immediate benefit. For example, in the Emergency Care Data Set (ECDS) many clinicians simply code the main factor (such as main diagnosis or main comorbidity) rather than all of the details. This means that other information is missing in the dataset (which may be MAR or MNAR depending on the relationship between the recorded and missing diagnoses).

Patients not willing—some groups of patients may be less likely to communicate information—a complex interaction of social, age-related, societal, ethnic and gender-based factors giving MAR data.

Temporal change in data structure—patient datasets continuously evolve as changes in healthcare create changes in the data structure, such as the inclusion of a new test or other piece of information (eg, a frailty score is a relatively recent addition in the ECDS). This means that all cells for this variable in the database before the change will be blank. A good example of temporal change is a move from one EPR to another—which may have a different data structure. This is obligatory missing data. Again, it could be argued that these data are not missing as it was never recorded, but the same issues arise for big data analytics.

Withdrawn consent for data use—5.4% of NHS patients in the UK have opted out of some uses of their data. This ‘case deletion’ is not a random process and so may bias the remaining data; this is Dark Data as there is no way of knowing that the data are missing.

Deliberate manipulation of data—deletion of data through hacking or other malicious intent. This could be MCAR if the hack was completely non-discriminatory or MNAR if the hacker specifically deleted certain information.

Data loss—in the complexity of healthcare data systems there is the potential for corruption or loss of data during processes such as transfer of data, backup, merging legacy systems or moving between EPRs. This may be completely at random (MCAR), but it is more likely that the data lost relate to a specific time period or type of data and would then be considered MNAR and therefore a potential cause of bias.

How to understand your missing data

To understand the patterns of missingness, a missing data analysis needs to be performed before the data are analysed. This involves:

Examination of the data to look at the amount of missing data for each variable. Also understand how missingness is distributed across the cases. For example, a dataset with 5% missing data could be many cases with a little missing data or a few cases with a lot of missing data.
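In pandas, this first examination can be a couple of lines; the file name ed_extract.csv is a placeholder for a local EPR extract:

```python
import pandas as pd

df = pd.read_csv("ed_extract.csv")  # hypothetical EPR extract

# Proportion of missing values per variable
print(df.isna().mean().sort_values(ascending=False))

# How missingness is spread across cases: many slightly incomplete rows,
# or a few rows missing almost everything?
print(df.isna().sum(axis=1).describe())
```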

Developing an understanding of the meaning of the missing values. This may involve discussion with healthcare staff who understand the data collection process and data engineers who understand how the dataset was curated and extracted. These people often have knowledge that enables you to understand why some data are missing, for example, a server failure on a particular day or the introduction of new data fields due to the change in the type of analyser used for near patient testing.

Understanding if some of the data should be missing, and if so, deciding how to handle this obligatory missingness in the analysis/modelling. For example, if pregnancy is an important risk factor, any modelling (either statistical or artificial intelligence) will struggle with the missing data for male pregnancy tests. So, the problem can be avoided by understanding why these data are missing and deciding if some data engineering might be appropriate before analysis (such as creating a new field of ‘Yes’ or ‘No’ for pregnancy in all patients, which will not have missing data).

An exploratory analysis to look for patterns of non-random dependences within the missing data:

For each variable, divide the patients into those with missing and those with known data and tabulate the characteristics of each group (all other variables). An uneven distribution of other variables between the groups means that there is more likely to be some systematic missingness (all other variables should be evenly distributed between the groups if data are MCAR).

Test whether properties that are missing are dependent on other variables in the dataset. For example, to test the randomness of missing data in variable A in relation to variable B two groups of patients can be created depending on whether variable A is ‘known’ or ‘missing’. If there is a random association between the missing data in variable A and variable B, the means and distribution of variable B will be the same in both groups. A t-test of means equality 9 can be used to make this comparison (the exact comparison depends on the proportion of missing data). A Mann-Whitney U test should be used if variable B is not normally distributed. However, it can still be difficult to find and compensate for patterns of missingness when they are related to observed factors, for example, if the relationship is highly complex (eg, if the data are missing due to an interaction of age, ethnicity and social deprivation). There is also the danger that using this approach for very large datasets might detect statistically significant but very small differences between A and B, which are too small to impact on findings. Similarly, small datasets may fail to detect differences between A and B which are meaningful.
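A sketch of this comparison using SciPy is shown below; the column names gcs (variable A) and age (variable B) are placeholders for whatever pair of variables is being tested:

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("ed_extract.csv")  # hypothetical extract with 'gcs' and 'age'

# Split patients by whether variable A ('gcs') is missing
missing_a = df["gcs"].isna()
b_when_missing = df.loc[missing_a, "age"].dropna()
b_when_known = df.loc[~missing_a, "age"].dropna()

# If variable B is roughly normal, compare means with a t-test ...
t, p_t = stats.ttest_ind(b_when_known, b_when_missing)
# ... otherwise compare distributions with a Mann-Whitney U test
u, p_mw = stats.mannwhitneyu(b_when_known, b_when_missing)
print(f"t-test p = {p_t:.3f}, Mann-Whitney p = {p_mw:.3f}")
```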

Evaluate whether missing data in one variable (A) are dependent on missing data in another variable (B). To do this, new binary variables ‘MA’ and ‘MB’ are created depending on the presence/absence of missing data for each variable, and a χ² test is used to assess the interdependence of missingness between the variables.
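And a matching sketch of the χ² check of interdependent missingness, again with placeholder column names:

```python
import pandas as pd
from scipy.stats import chi2_contingency

df = pd.read_csv("ed_extract.csv")  # hypothetical extract

# Binary indicators: is each variable missing for this patient?
ma = df["gcs"].isna()
mb = df["resp_rate"].isna()  # hypothetical second variable

# Chi-square test of independence on the 2x2 missingness table
table = pd.crosstab(ma, mb)
chi2, p, dof, expected = chi2_contingency(table)
print(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```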

Evaluate missing outcome data. There is the potential to model outcomes 4 to substitute for missing outcomes, but handling missing outcome data is complex and can easily lead to error. For example, if all of the dead patients are missing from the dataset, any attempt to model outcome (lived or died) for these patients will be misleading. Modelling of missing outcomes needs specialist advice due to the high chance of introducing bias.

How to read a report/paper

There are a series of questions which can help to critically appraise the quality of the analysis of missing data in a report or publication:

Does the Methods section describe the analysis of missing data?

Is there a table of missing data?

Is there an exploratory analysis of missing data?

Has the assumed classification of missing data been specified (MNAR, MAR, MCAR, etc)?

Is there a detailed description of all data engineering?

Is there a detailed description of how the analysis has been adjusted for missing data?

EPR datasets are becoming widely available, creating the potential for both benefit and risk in emergency care. Emergency physicians using routinely collected data for either service development or research need to understand the potential for missing data to cause wrong conclusions due to bias.

There are many different patterns and reasons for missing data in real-life emergency care datasets. An analysis of missingness is a vital first step in the process of understanding the potential for bias, which informs the selection of the best methods for analysis and guides the interpretation of results. Whether performing an analysis to inform your own ED or reading a paper based on routinely collected data it is essential to evaluate the potential impact of missing data on the conclusions.

Ethics statements

Patient consent for publication.

Not applicable.

Permission Part of this paper is reproduced with permission from a previous article from conference proceedings: N. Suzen, E. M. Mirkes, D. Roland, J. Levesley, A. N. Gorban and T. J. Coats, "What is Hiding in Medicine’s Dark Matter? Learning with Missing Data in Medical Practices," 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 2023, pp. 4979-4986, doi: 10.1109/BigData59044.2023.10386194. Published out of sequence, the conference proceedings are a practical application of the concepts developed by Coats and Mirkes in this paper.

Handling editor Richard Body

Contributors TJC and EMM contributed equally to the concept, drafting and reviewing of this work and have agreed to the final manuscript. TJC is the guarantor of the work. The EMJ editors and reviewers of the manuscript made comments assisting in the revisions.

Funding This work was supported by the Health Foundation (Grant No 1747259).

Competing interests None declared.

Provenance and peer review Not commissioned; internally peer reviewed.


OpenSAFELY: Effectiveness of COVID-19 vaccination in children and adolescents (medRxiv preprint)


Background Children and adolescents in England were offered BNT162b2 as part of the national COVID-19 vaccine roll out from September 2021. We assessed the safety and effectiveness of first and second dose BNT162b2 COVID-19 vaccination in children and adolescents in England.

Methods With the approval of NHS England, we conducted an observational study in the OpenSAFELY-TPP database, including a) adolescents aged 12-15 years and b) children aged 5-11 years, and comparing individuals receiving i) a first vaccination with unvaccinated controls and ii) a second vaccination with single-vaccinated controls. We matched vaccinated individuals with controls on age, sex, region, and other important characteristics. Outcomes were positive SARS-CoV-2 test (adolescents only); COVID-19 A&E attendance; COVID-19 hospitalisation; COVID-19 critical care admission; COVID-19 death, with non-COVID-19 death and fractures as negative control outcomes and A&E attendance, unplanned hospitalisation, pericarditis, and myocarditis as safety outcomes.

Results Amongst 820,926 previously unvaccinated adolescents, the incidence rate ratio (IRR) for positive SARS-CoV-2 test comparing vaccination with no vaccination was 0.74 (95% CI 0.72-0.75), although the 20-week risks were similar. The IRRs were 0.60 (0.37-0.97) for COVID-19 A&E attendance, 0.58 (0.38-0.89) for COVID-19 hospitalisation, 0.99 (0.93-1.06) for fractures, 0.89 (0.87-0.91) for A&E attendances and 0.88 (0.81-0.95) for unplanned hospitalisation. Amongst 441,858 adolescents who had received a first vaccination, IRRs comparing second dose with first dose only were 0.67 (0.65-0.69) for positive SARS-CoV-2 test, 1.00 (0.20-4.96) for COVID-19 A&E attendance, 0.60 (0.26-1.37) for COVID-19 hospitalisation, 0.94 (0.84-1.05) for fractures, 0.93 (0.89-0.98) for A&E attendance and 0.99 (0.86-1.13) for unplanned hospitalisation. Amongst 283,422 previously unvaccinated children and 132,462 children who had received a first vaccine dose, COVID-19-related outcomes were too rare to allow IRRs to be estimated precisely. A&E attendance and unplanned hospitalisation were slightly higher after first vaccination (IRRs versus no vaccination 1.05 (1.01-1.10) and 1.10 (0.95-1.26) respectively) but slightly lower after second vaccination (IRRs versus first dose 0.95 (0.86-1.05) and 0.78 (0.56-1.08) respectively). There were no COVID-19-related deaths in any group. Fewer than seven (exact number redacted) COVID-19-related critical care admissions occurred in the adolescent first dose vs unvaccinated cohort. Among both adolescents and children, myocarditis and pericarditis were documented only in the vaccinated groups, with rates of 27 and 10 cases/million after first and second doses respectively.

Conclusion BNT162b2 vaccination in adolescents reduced COVID-19 A&E attendance and hospitalisation, although these outcomes were rare. Protection against positive SARS-CoV-2 tests was transient.

Competing Interest Statement

BG has received research funding from the Laura and John Arnold Foundation, the NHS National Institute for Health Research (NIHR), the NIHR School of Primary Care Research, NHS England, the NIHR Oxford Biomedical Research Centre, the Mohn-Westlake Foundation, NIHR Applied Research Collaboration Oxford and Thames Valley, the Wellcome Trust, the Good Thinking Foundation, Health Data Research UK, the Health Foundation, the World Health Organisation, UKRI MRC, Asthma UK, the British Lung Foundation, and the Longitudinal Health and Wellbeing strand of the National Core Studies programme; he is a Non-Executive Director at NHS Digital; he also receives personal income from speaking and writing for lay audiences on the misuse of science. BMK is also employed by NHS England working on medicines policy and clinical lead for primary care medicines data. IJD has received unrestricted research grants and holds shares in GlaxoSmithKline (GSK).

Funding Statement

The OpenSAFELY Platform is supported by grants from the Wellcome Trust (222097/Z/20/Z); MRC (MR/V015757/1, MC_PC-20059, MR/W016729/1); NIHR (NIHR135559, COV-LT2-0073), and Health Data Research UK (HDRUK2021.000, 2021.0157). In addition, this research used data assets made available as part of the Data and Connectivity National Core Study, led by Health Data Research UK in partnership with the Office for National Statistics and funded by UK Research and Innovation (grant ref MC_PC_20058). BG has also received funding from: the Bennett Foundation, the Wellcome Trust, NIHR Oxford Biomedical Research Centre, NIHR Applied Research Collaboration Oxford and Thames Valley, the Mohn-Westlake Foundation; all Bennett Institute staff are supported by BG's grants on this work. The views expressed are those of the authors and not necessarily those of the NIHR, NHS England, UK Health Security Agency (UKHSA) or the Department of Health and Social Care.

Funders had no role in the study design, collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study was approved by the Health Research Authority (REC reference 20/LO/0651) and by the London School of Hygiene and Tropical Medicine Ethics Board (reference 21863).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Data Availability

All data were linked, stored and analysed securely using the OpenSAFELY platform, https://www.opensafely.org/ , as part of the NHS England OpenSAFELY COVID-19 service. Data include pseudonymised data such as coded diagnoses, medications and physiological parameters. No free text data was included. All code is shared openly for review and re-use under MIT open license [ https://github.com/opensafely/vaccine-effectiveness-in-kids ]. Detailed pseudonymised patient data is potentially re-identifiable and therefore not shared. Primary care records managed by the GP software provider, TPP were linked to ONS death data and the Index of Multiple Deprivation through OpenSAFELY.



How to use ChatGPT to make charts and tables with Advanced Data Analysis

By David Gewirtz

Know what floats my boat? Charts and graphs.

Give me a cool chart to dig into and I'm unreasonably happy. I love watching the news on election nights, not for the vote count, but for all the great charts. I switch between channels all evening to see every possible way that each network finds to present numerical data. 

Is that weird? I don't think so.

Also:  The moment I realized ChatGPT Plus was a game-changer for my business

As it turns out, ChatGPT does a great job making charts and tables. And given that this ubiquitous generative AI chatbot can synthesize a ton of information into something chart-worthy, what ChatGPT gives up in pretty presentation it more than makes up for in informational value.

It should come as no surprise to anybody that AI chatbots' feature sets are changing constantly. As of the time of this update (end of May 2024), OpenAI has just come out with a Mac application and has released its GPT-4o LLM, which is available to both free and paying customers. The GPT-4o version included with the paid Plus plan is supposed to have interactive chart features and the ability to interact with the engine longer per session.

But, in practice, not so much. GPT-4o hasn't rolled out to all free accounts yet, and my free account doesn't offer it at all. And while the paid ChatGPT Plus plan does provide the interactive charts feature in Chrome and Safari, it doesn't in the Mac app.

Also: ChatGPT vs. ChatGPT Plus: Is a paid subscription still worth it?

This article was last updated when the Advanced Data Analysis features (which include charts) were only available to Plus customers. Even though some of those features are supposed to be coming to free customers, my free account doesn't have them yet, so I'm going to present the rest of this article as if the charting features are only available to Plus customers. If you're a free customer and you have GPT-4o, feel free to try some of the prompts. Those features may work for you, and undoubtedly will as time goes on.

Advanced Data Analysis produces relatively ugly charts. But it rocks. First, let's discuss where ChatGPT gets its data, then we'll make some tables.

How to use ChatGPT to make charts and tables

1. Understand the different versions of ChatGPT

Earlier, we talked about which charting tools are available in which versions of ChatGPT. But there's more to it than simply charting tools. If you want to use ChatGPT productively, you need to understand what the various editions can do.

ChatGPT free version:  This version has historically used the GPT-3.5 large language model (LLM), which isn't quite as capable as the  GPT-4 version . As of May 2024, the GPT-4o LLM is also available to some free users and rolling out over time.

ChatGPT Plus: ChatGPT Plus is OpenAI's commercial, fully powered version of ChatGPT. Right now, ChatGPT Plus provides three major selection options per session: GPT-3.5, GPT-4, and GPT-4o. It used to offer plugins, but they've been replaced by custom GPTs.

The GPT-4 and GPT-4o versions now include DALL-E 3, Bing Web access, and Advanced Data Analysis. Some users have reported some difficulty with using Bing for web access. Most of what we will be doing is using the Advanced Data Analysis component. Even without Bing web access, GPT-4 and 4o report that training data now includes information up to December 2023. 

Also: What does GPT stand for? Understanding GPT 3.5, GPT 4, GPT-4o, and more

For much of this article, we will be using the Advanced Data Analysis component of the GPT-4 option. This tool will import data tables in a wide range of file formats. While it doesn't specify a size limit for imported data, it can handle fairly large files, but will break if the files exceed some undefined level of complexity.

As ChatGPT Plus changes, and it will, we will update you with more information. For now, let's just look at making some cool charts.

ChatGPT Enterprise: Advanced Data Analytics and plugins are also available in the enterprise version. You can upload files to Enterprise, and they will remain confidential. Enterprise is also supposed to allow for bigger files and bigger responses. Pricing has not been specified.

2. Create a basic table

Let's start with an example. For the following demonstration, we'll be working with the top five cities in terms of population.

List the top five cities in the world by population. Include country.

I asked this question to ChatGPT's free version and here's what I got back:

Turning that data into a table is simple. Just tell ChatGPT you want a table:

Make a table of the top five cities in the world by population. Include country.

3. Manipulate the table

You can manipulate and customize a table by giving ChatGPT more detailed instructions. Again, using the free version, we'll add a population count field. Of course, that data is out of date, but it's presented anyway:

Make a table of the top five cities in the world by population. Include country and a population field

You can also specify certain details for the table, like field order and units. Here, I'm moving the country first and compressing the population numbers.

Make a table of the top five cities in the world by population. Include country and a population field. Display the fields in the order of rank, country, city, population. Display population in millions (with one decimal point), so 37,833,000 would display as 37.8M.

Note that I gave the AI an example of how I wanted the numbers to display.

That's about as far as the free version will take us. From now on, we're switching to the $20/month ChatGPT Plus version.

4. Create a bar chart

ChatGPT Plus with Advanced Data Analytics enabled can make line charts, bar charts, histograms, pie charts, scatter plots, heatmaps, box plots, area charts, bubble charts, Gantt charts, Pareto charts, network diagrams, Sankey diagrams, choropleth maps, radar charts, word clouds, treemaps, and 3D charts.

In this example, we're just going to make a simple bar chart.

Make a bar chart of the top five cities in the world by population

Chatty little tool, isn't it?

The eagle-eyed among you may have noticed the discrepancy in populations between the previous table shown and the results here. Notice that the table has a green icon and this graph has a purple icon. We've jumped from GPT-3.5 (the free version of ChatGPT) to GPT-4 (in ChatGPT Plus). It's interesting that the differing LLMs have slightly different data. This difference is all part of why it pays to be careful when using AIs, so double-check your work. In our case, we're just demonstrating charts, but this is a tangible example of where confidently presented data can be wrong or inconsistent.  

5. Upload data

One of Advanced Data Analytics' superpowers is the ability to upload a dataset. For our example, I downloaded the Popular Baby Names dataset from Data.gov. This is a comma-separated file of New York City baby names from 2011-2014. Even though it's a decade out of date, it's fun to play with.

The dataset I chose for this article is readily available from a government site, so you can replicate this experiment on your own. There are a ton of great datasets available on Data.gov, but I found that many are far too large for ChatGPT to use.

Also:  How to use ChatGPT to create an app

Once I downloaded this one, I realized it also included information on ethnicity, so we can run a number of different charts from the same dataset.

Click the little upload button and then tell it the data file you want to import.

I asked it to show me the first five lines of the file so I'd know more about the file's format.

6. Create a pie chart (and change colors)

I was curious about how the dataset distributed gender names. Here's my first prompt:

Create a pie chart showing gender as a percentage of the overall dataset

And here's the result:

Unfortunately, the dark shade of green makes the numbers difficult to read. Fortunately, you can instruct Advanced Data Analytics to use different colors. I was careful to choose colors that did not reinforce gender stereotypes.

Create a pie chart showing gender as a percentage of the overall dataset. Use light green for male and medium yellow for female.

7. Normalize data for accuracy

As we saw earlier, the data collected includes ethnicity. Here's how to see the distribution of the various ethnicities New York recorded in the early 2010s:

Show the distribution of ethnicity in the dataset using a pie chart. Use only light colors.

And here's the result. Notice anything?

Apparently, New York didn't properly normalize its data. The file contains both "WHITE NON HISPANIC" and "WHITE NON HISP", both "BLACK NON HISPANIC" and "BLACK NON HISP", and both "ASIAN AND PACIFIC ISLANDER" and "ASIAN AND PACI". This resulted in inaccurate representations of the data.

One benefit of ChatGPT is that it remembers instructions throughout a session. So I was able to give it this instruction:

For all the following requests, group "WHITE NON HISPANIC" and "WHITE NON HISP" together. Group "BLACK NON HISPANIC" and "BLACK NON HISP" together. Group "ASIAN AND PACIFIC ISLANDER" and "ASIAN AND PACI". Use the longer of the two ethnicity names when displaying ethnicity.

And it replied:

Let's try the chart again, using the same prompt.

That's better:

You need to be diligent when looking at results. For example, in a request for top baby names, the AI separated out "Madison" and "MADISON" as two different names:

For all the following requests, baby names should be case insensitive.
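If you would rather clean the file locally before uploading it, the same normalization can be done in pandas. This is only a sketch: the column names are assumed from the Data.gov file and may need adjusting to match your download:

```python
import pandas as pd

# Hypothetical local copy of the Popular Baby Names CSV from Data.gov;
# column names below are assumed from that dataset.
df = pd.read_csv("Popular_Baby_Names.csv")

# Collapse the inconsistent ethnicity labels onto the longer spelling
df["Ethnicity"] = df["Ethnicity"].replace({
    "WHITE NON HISP": "WHITE NON HISPANIC",
    "BLACK NON HISP": "BLACK NON HISPANIC",
    "ASIAN AND PACI": "ASIAN AND PACIFIC ISLANDER",
})

# Make name comparisons case insensitive as well
df["Child's First Name"] = df["Child's First Name"].str.title()

print(df["Ethnicity"].value_counts())
```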

8. Export your graphics

Let's wrap up with a complex chart from one prompt. Here's our prompt:

For each ethnicity, present two pie charts, one for each gender. Each pie chart should list the top five baby names for that gender and that ethnicity. Use only light colors.

As it turns out, the chart was generated with text that was too small to read. So, to get a more useful chart, we can export it back out. I'm going to specify both the file format and the file width:

Export this chart as a 3000 pixel wide JPG file.

Notice that Sofia and Sophia are very popular, but are shown as two different names. But that's what makes charts so fascinating.

How much does it cost to use Advanced Data Analytics?

Advanced Data Analytics comes with ChatGPT Plus. Some of its features are available in GPT-4o for the free version of ChatGPT. ChatGPT Plus is $20/month. Advanced Data Analytics also is included with the Enterprise edition, but pricing for that hasn't been released yet.

Is the data uploaded to ChatGPT for charting kept private or is there a risk of data exposure?

Assume that there's always a privacy risk.

I asked this question to ChatGPT and this is what it told me: 

Data privacy is a priority for ChatGPT. Uploaded data is used solely for the purpose of the user's current session and is not stored long-term or used for any other purposes. However, for highly sensitive data, users should always exercise caution and consider using the Enterprise version of ChatGPT, which offers enhanced data confidentiality.

Also: Generative AI brings new risks to everyone. Here's how you can stay safe

My recommendation: Don't trust ChatGPT or any generative AI tool with sensitive data. The Enterprise version is supposed to have more privacy controls, but I would recommend you only upload data that you wouldn't mind finding its way to public visibility.

Can ChatGPT's Advanced Data Analysis handle real-time data or is it more suited for static datasets?

It's possible, but there are some practical limitations. First, the Plus account will throttle the number of requests you can make in a given period of time. Second, you have to upload each file individually. There is the possibility you could use a licensed ChatGPT API to do real-time analytics. But for the chatbot itself, you're looking at parsing data at rest.



how to write analysis and interpretation of data in research

Salesforce is closed for new business in your area.

IMAGES

  1. SOLUTION: Thesis chapter 4 analysis and interpretation of data sample

    how to write analysis and interpretation of data in research

  2. 5 Steps of the Data Analysis Process

    how to write analysis and interpretation of data in research

  3. (PDF) CHAPTER FOUR DATA ANALYSIS AND PRESENTATION OF RESEARCH FINDINGS

    how to write analysis and interpretation of data in research

  4. FREE 13+ Research Analysis Samples in MS Word

    how to write analysis and interpretation of data in research

  5. SOLUTION: Thesis chapter 4 analysis and interpretation of data sample

    how to write analysis and interpretation of data in research

  6. DATA ANALYSIS AND INTERPRETATION In Research Methodology, Data Analysis and Interpretation Types.PDF

    how to write analysis and interpretation of data in research

VIDEO

  1. Data Analysis in Research

  2. HOW TO READ and ANALYZE A RESEARCH STUDY

  3. How to present research tools, procedures and data analysis techniques

  4. Analysis of Data? Some Examples to Explore

  5. Data Analysis [Video 6]

  6. Data Interpretation

COMMENTS

  1. Data Interpretation: Definition and Steps with Examples

    Data interpretation is the process of reviewing data and arriving at relevant conclusions using various analytical research methods. Data analysis assists researchers in categorizing, manipulating data, and summarizing data to answer critical questions. LEARN ABOUT: Level of Analysis.

  2. Data Interpretation

    The purpose of data interpretation is to make sense of complex data by analyzing and drawing insights from it. The process of data interpretation involves identifying patterns and trends, making comparisons, and drawing conclusions based on the data. The ultimate goal of data interpretation is to use the insights gained from the analysis to ...

  3. What Is Data Interpretation? Meaning & Analysis Examples

    2. Brand Analysis Dashboard. Next, in our list of data interpretation examples, we have a template that shows the answers to a survey on awareness for Brand D. The sample size is listed on top to get a perspective of the data, which is represented using interactive charts and graphs. **click to enlarge**.

  4. A practical guide to data analysis in general literature reviews

    This article is a practical guide to conducting data analysis in general literature reviews. The general literature review is a synthesis and analysis of published research on a relevant clinical issue, and is a common format for academic theses at the bachelor's and master's levels in nursing, physiotherapy, occupational therapy, public health and other related fields.

  5. The Beginner's Guide to Statistical Analysis

    Step 1: Write your hypotheses and plan your research design. To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design. Writing statistical hypotheses. The goal of research is often to investigate a relationship between variables within a population. You start with a prediction ...

  6. A Really Simple Guide to Quantitative Data Analysis

    It is important to know w hat kind of data you are planning to collect or analyse as this w ill. affect your analysis method. A 12 step approach to quantitative data analysis. Step 1: Start with ...

  7. A Practical Guide to Writing Quantitative and Qualitative Research

    A research question is what a study aims to answer after data analysis and interpretation. The answer is written in length in the discussion section of the paper. ... of the study meant to address the problem posed in the research question.1 An excellent research question clarifies the research writing while facilitating understanding of the ...

  8. Creating a Data Analysis Plan: What to Consider When Choosing

    INTRODUCTION. Statistics represent an essential part of a study because, regardless of the study design, investigators need to summarize the collected information for interpretation and presentation to others. It is therefore important for us to heed Mr Twain's concern when creating the data analysis plan. In fact, even before data collection ...

  9. 30 Interpretation Strategies: Appropriate Concepts

    Qualitative researchers and those writing about qualitative methods often intertwine the terms analysis and interpretation. For example, Hubbard and Power (2003) describe data analysis as, "bringing order, structure, and meaning to the data" (p. 88). To us, this description combines analysis with interpretation.

  10. How to Write a Results Section

    Here are a few best practices: Your results should always be written in the past tense. While the length of this section depends on how much data you collected and analyzed, it should be written as concisely as possible. Only include results that are directly relevant to answering your research questions.

  11. The Library: Research Skills: Analysing and Presenting Data

    Overview. Data analysis is an ongoing process that should occur throughout your research project. Suitable data-analysis methods must be selected when you write your research proposal. The nature of your data (i.e. quantitative or qualitative) will be influenced by your research design and purpose. The data will also influence the analysis ...

  12. PDF Chapter 4: Analysis and Interpretation of Results

    The analysis and interpretation of data is carried out in two phases. The. first part, which is based on the results of the questionnaire, deals with a quantitative. analysis of data. The second, which is based on the results of the interview and focus group. discussions, is a qualitative interpretation.

  13. Learning to Do Qualitative Data Analysis: A Starting Point

    For many researchers unfamiliar with qualitative research, determining how to conduct qualitative analyses is often quite challenging. Part of this challenge is due to the seemingly limitless approaches that a qualitative researcher might leverage, as well as simply learning to think like a qualitative researcher when analyzing data. From framework analysis (Ritchie & Spencer, 1994) to content ...

  14. Qualitative Data Analysis: Step-by-Step Guide (Manual vs ...

    Step 1: Gather your qualitative data and conduct research. The first step of qualitative research is data collection. Put simply, data collection is gathering all of your data for analysis. A common situation is when qualitative data is spread across various sources.

  15. PDF Structure of a Data Analysis Report

    Data - Methods - Analysis - Results. This format is very familiar to those who have written psych research papers. It often works well for a data analysis paper as well, though one problem with it is that the Methods section often sounds like a bit of a stretch: in a psych research paper the Methods section describes what you did to ...

  16. Research Guide: Data analysis and reporting findings

    Data analysis is the most crucial part of any research. It summarizes the collected data and involves interpreting the data gathered through analytical and logical reasoning to determine patterns, relationships or trends.

  17. (PDF) Qualitative Data Analysis and Interpretation: Systematic Search

    Qualitative data analysis is concerned with transforming raw data by searching, evaluating, recognising, coding, mapping, exploring and describing patterns, trends, themes and categories in ...
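
    As a toy illustration of the coding and mapping steps, the sketch below tallies researcher-assigned codes across interview segments in Python; the segments and code labels are invented.

```python
from collections import Counter

# Toy example: interview segments already coded by the researcher.
coded_segments = [
    ("I never know who to ask for help", "support"),
    ("The training was rushed", "training"),
    ("My manager checks in weekly", "support"),
    ("We got no onboarding at all", "training"),
    ("Colleagues share tips informally", "support"),
]

# Counting code frequencies is one simple way to start mapping
# coded data onto candidate themes.
code_counts = Counter(code for _, code in coded_segments)
for code, n in code_counts.most_common():
    print(f"{code}: {n} segment(s)")
```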

  18. Chapter Four Data Presentation, Analysis and Interpretation 4.0

    DATA PRESENTATION, ANALYSIS AND INTERPRETATION. 4.0 Introduction. This chapter is concerned with data presentation of the findings obtained through the study. The findings are presented in ...

  19. Analyzing and Interpreting Data

    Interpreting the Confidence Interval. Meaning of a confidence interval. A CI can be regarded as the range of values consistent with the data in a study. Suppose a study conducted locally yields an RR of 4.0 for the association between intravenous drug use and disease X; the 95% CI ranges from 3.0 to 5.3.
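
    Since a 95% CI for a relative risk is normally computed on the log scale, the sketch below reconstructs the interval quoted above (RR = 4.0, 95% CI 3.0 to 5.3) in Python, inferring the standard error from the interval itself. This is a back-of-the-envelope check, not the original study's code.

```python
import math

# Values quoted in the snippet above.
rr, ci_low, ci_high = 4.0, 3.0, 5.3

# On the log scale, ln(upper) - ln(lower) = 2 * 1.96 * SE(ln RR).
se_log_rr = math.log(ci_high / ci_low) / (2 * 1.96)

# Reconstruct the interval: exp(ln(RR) +/- 1.96 * SE).
lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
print(f"SE(ln RR) ~ {se_log_rr:.3f}")          # about 0.145
print(f"95% CI: {lower:.1f} to {upper:.1f}")   # about 3.0 to 5.3
```

    Because the whole interval lies well above 1.0, the data are consistent only with a substantially elevated risk, which is exactly the "range of values consistent with the data" reading described above.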

  20. 10.5 Analysis of Qualitative Interview Data

    Analysis of qualitative interview data typically begins with a set of transcripts of the interviews conducted. Obtaining said transcripts requires either having taken exceptionally good notes during an interview or, preferably, having recorded the interview and then transcribed it.

  21. PDF Chapter 6: Data Analysis and Interpretation 6.1. Introduction

    recommendations (cf. Chap. 8). The focus now turns to the analysis and interpretation of the data for this study. 6.2 ANALYSIS AND INTERPRETATION OF DATA. Marshall and Rossman (1999:150) describe data analysis as the process of bringing order, structure and meaning to the mass of collected data. It is described as messy, ambiguous and ...

  22. Data Analysis in Research: Types & Methods

    Main goals: describe data, make inferences, predict future events, and provide data-driven recommendations. Stages of data analysis: data collection, preprocessing, exploratory data analysis, model building and testing, interpretation, and reporting. Types of data analysis: 1. Descriptive analysis. Descriptive analysis focuses on summarizing and describing the features of a dataset.
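
    As a minimal sketch of descriptive analysis, assuming a small invented dataset of survey responses:

```python
import pandas as pd

# Hypothetical survey responses, invented for illustration.
df = pd.DataFrame({
    "age": [23, 35, 31, 47, 29, 52, 38],
    "satisfaction": [4, 5, 3, 4, 2, 5, 4],  # 1-5 scale
})

# Descriptive analysis: summarize the features of the dataset.
print(df.describe())  # count, mean, std, min, quartiles, max
print(df["satisfaction"].value_counts().sort_index())  # frequency table
```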

  23. Missing data in emergency care: a pitfall in the interpretation of

    Electronic patient records (EPRs) are potentially valuable sources of data for service development or research but often contain large amounts of missing data. Complete case analysis and imputation of missing data seem like simple solutions, and are increasingly easy to perform in software packages, but can easily distort data and give misleading results if used without an understanding ...
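
    The distortion the authors warn about is easy to demonstrate. In this illustrative sketch (values invented), mean-imputation leaves the mean unchanged but artificially shrinks the spread, and both approaches can bias results if values are not missing at random.

```python
import numpy as np
import pandas as pd

# Hypothetical blood-pressure readings; NaN marks values missing
# from the electronic patient record.
bp = pd.Series([120, 135, np.nan, 150, np.nan, 95, 180, np.nan])

# Complete case analysis: drop every record with a missing value.
cc = bp.dropna()
print(f"complete-case: mean {cc.mean():.1f}, sd {cc.std():.1f}")

# Simple imputation: replace missing values with the observed mean.
imputed = bp.fillna(bp.mean())
print(f"mean-imputed:  mean {imputed.mean():.1f}, sd {imputed.std():.1f}")
# The imputed series has the same mean but a smaller sd, understating
# uncertainty; if sicker patients are more likely to have missing
# readings, both estimates are also biased.
```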

  24. Evaluation of Bitemark Analysis's Potential Application in Forensic

    Bitemark analysis involves the examination of both patterned injuries and contextual circumstances, combining morphological and positional data. Considering the uniqueness of human dentition, bitemarks caused by teeth on skin or impressions on flexible surfaces could assist in human identification. Aims: to investigate the available literature systematically and evaluate the scientific ...

  25. OpenSAFELY: Effectiveness of COVID-19 vaccination in children and

    Funders had no role in the study design, collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication.

  26. How to use ChatGPT to make charts and tables

    The GPT-4 and GPT-4o versions now include DALL-E 3, Bing Web access, and Advanced Data Analysis. Some users have reported difficulty with using Bing for web access.

  27. Comparative Performance Study of Dissolved Gas Analysis (DGA) Methods

    1. Introduction. The power transformer is a key component of an electrical network system, used to step voltage up and down. The health condition of a power transformer depends mainly on the insulation oil during normal operation []. Intensive care and analysis of the insulation condition of power transformers are essential to ensure consistent operation.

  28. Interpretation of Hot Spots in Wuhan New Town Development and Analysis

    The construction of new towns is one of the main measures used to disperse urban populations and promote regional coordination and urban-rural integration in China. Mining the spatio-temporal pattern of new town hot spots from multivariate data and analyzing the factors that influence new town construction hot spots can provide a strategic basis for new town construction, but few researchers ...

  29. Generative AI Statistics for 2024

    Salesforce's research showed that most C-suite leaders (83%) claim they know how to use generative AI while keeping data secure, compared with only 29% of individual contributors. Clearly, companies have a gap to close when it comes to ensuring generative AI is effectively adopted and used. Fortunately, the data offers some insight.