What is internal validity?
Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.
Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.
Methodology refers to the overarching strategy and rationale of your research project. It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.
Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys, and statistical tests).
In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section.
In a longer or more complex research project, such as a thesis or dissertation, you will probably include a methodology section, where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.
Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.
Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.
A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.
In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
Reliability and validity are both about how well a method measures something:
If you are doing experimental research, you also have to consider the internal and external validity of your experiment.
Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables.
External validity is the extent to which your results can be generalized to other contexts.
The validity of your experiment depends on your experimental design.
Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:
When designing the experiment, you decide:
Experimental design is essential to the internal and external validity of your experiment.
You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause, while a dependent variable is the effect.
In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:
Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design.
Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).
Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).
You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.
Discrete and continuous variables are two types of quantitative variables:
A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.
A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.
In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.
The research methods you use depend on the type of data you need to answer your research question.
In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.
There are eight threats to internal validity: history, maturation, instrumentation, testing, selection bias, regression to the mean, social interaction and attrition.
Longitudinal studies and cross-sectional studies are two different types of research design. In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.
Longitudinal study | Cross-sectional study |
---|---|
Repeated observations | Observations at a single point in time |
Observes the same group multiple times | Observes different groups (a “cross-section”) in the population |
Follows changes in participants over time | Provides snapshot of society at a given point |
Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.
The 1970 British Cohort Study, which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study.
Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.
Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.
Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.
Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study.
The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.
The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).
There are seven threats to external validity: selection bias, history, experimenter effect, Hawthorne effect, testing effect, aptitude-treatment and situation effect.
Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient and manageable.
Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.
A statistic refers to measures about the sample, while a parameter refers to measures about the population.
A sampling error is the difference between a population parameter and a sample statistic.
Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others.
Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.
Some common types of sampling bias include self-selection, non-response, undercoverage, survivorship, pre-screening or advertising, and healthy user bias.
Using careful research design and sampling procedures can help you avoid sampling bias. Oversampling can be used to correct undercoverage bias.
Probability sampling means that every member of the target population has a known chance of being included in the sample.
Probability sampling methods include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.
In non-probability sampling, the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.
Common non-probability sampling methods include convenience sampling, voluntary response sampling, purposive sampling, snowball sampling, and quota sampling.
Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.
You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment.
No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!
Yes, but including more than one of either type requires multiple research questions.
For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.
You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable.
To ensure the internal validity of an experiment, you should only change one independent variable at a time.
To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables, or even find a causal relationship where none exists.
A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.
Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.
There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.
In restriction, you restrict your sample by only including certain subjects that have the same values of potential confounding variables.
In matching, you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable.
In statistical control, you include potential confounders as variables in your regression.
In randomization, you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.
Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.
When conducting research, collecting original data has significant advantages:
However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.
Operationalization means turning abstract conceptual ideas into measurable observations.
For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.
Before collecting data, it’s important to consider how you will operationalize the variables that you want to measure.
Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
There are five common approaches to qualitative research:
There are various approaches to qualitative data analysis, but they all share five steps in common:
The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis, thematic analysis, and discourse analysis.
In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).
The process of turning abstract concepts into measurable variables and indicators is called operationalization.
A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.
To use a Likert scale in a survey, you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
Individual Likert-type questions are generally considered ordinal data, because the items have clear rank order, but don’t have an even distribution.
Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.
The type of data determines what statistical tests you should use to analyze your data.
An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.
A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.
However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).
For strong internal validity, it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.
Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment.
Blinding is important to reduce bias and ensure a study’s internal validity.
If participants know whether they are in a control or treatment group, they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.
A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.
Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment.
Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity as they can use real-world interventions instead of artificial laboratory settings.
Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population. Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
The American Community Survey is an example of simple random sampling. In order to collect detailed data on the population of the US, the Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.
If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity. However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied,
If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.
Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.
The clusters should ideally each be mini-representations of the population as a whole.
There are three types of cluster sampling: single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.
Cluster sampling is more time- and cost-efficient than other probability sampling methods, particularly when it comes to large samples spread across a wide geographical area.
However, it provides less statistical certainty than other methods, such as simple random sampling, because it is difficult to ensure that your clusters properly represent the population as a whole.
In stratified sampling, researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment, etc).
Once divided, each subgroup is randomly sampled using another probability sampling method.
You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.
Using stratified sampling will allow you to obtain more precise (with lower variance) statistical estimates of whatever you are trying to measure.
For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.
Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.
For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.
Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling.
There are three key steps in systematic sampling:
A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.
A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.
If something is a mediating variable:
Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.
Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.
A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.
Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity.
If you don’t control relevant extraneous variables, they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable.
“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.
Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs. That way, you can isolate the control variable’s effects from the relationship between the variables of interest.
In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.
Random selection, or random sampling, is a way of selecting members of a population for your study’s sample.
In contrast, random assignment is a way of sorting the sample into control and experimental groups.
Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.
To implement random assignment, assign a unique number to every member of your study’s sample.
Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a dice to randomly assign participants to groups.
Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.
In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.
In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.
In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.
The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.
Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.
While a between-subjects design has fewer threats to internal validity, it also requires more participants for high statistical power than a within-subjects design.
Advantages:
Disadvantages:
Within-subjects designs have many potential threats to internal validity, but they are also very statistically powerful.
Advantages:
Disadvantages:
In a factorial design, multiple independent variables are tested.
If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.
An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.
A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.
There are 4 main types of extraneous variables:
In a controlled experiment, all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:
Depending on your study topic, there are various other methods of controlling variables.
The difference between explanatory and response variables is simple:
The term “explanatory variable” is sometimes preferred over “independent variable” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.
Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.
On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.
Random and systematic error are two types of measurement error.
Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).
Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).
Systematic error is generally a bigger problem in research.
With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample, the errors in different directions will cancel each other out.
Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions (Type I and II errors) about the relationship between the variables you’re studying.
Random error is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables.
You can avoid systematic error through careful design of your sampling, data collection, and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment; and apply masking (blinding) where possible.
A correlation reflects the strength and/or direction of the association between two or more variables.
A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.
A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.
Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.
A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.
Controlled experiments establish causality, whereas correlational studies only show associations between variables.
In general, correlational research is high in external validity while experimental research is high in internal validity.
Correlation describes an association between variables: when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.
Causation means that changes in one variable brings about changes in the other; there is a cause-and-effect relationship between variables. The two variables are correlated with each other, and there’s also a causal link between them.
The third variable and directionality problems are two main reasons why correlation isn’t causation.
The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.
The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.
A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.
Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.
Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.
You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire easier and quicker, but it may lead to bias. Randomization can minimize the bias from order effects.
Questionnaires can be self-administered or researcher-administered.
Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.
Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.
A research design is a strategy for answering your research question. It defines your overall approach and determines how you will collect and analyze data.
The priorities of a research design can vary depending on the field, but you usually have to specify:
A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions, utilizing credible sources. This allows you to draw valid, trustworthy conclusions.
Quantitative research designs can be divided into two main categories:
Qualitative research designs tend to be more flexible. Common types of qualitative design include case study, ethnography, and grounded theory designs.
These are the assumptions your data must meet if you want to use Pearson’s r:
Correlation coefficients always range between -1 and 1.
The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.
The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.
No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.
To find the slope of the line, you’ll need to perform a regression analysis.
In multistage sampling, or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.
This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.
Triangulation means using multiple methods to collect and analyze data on the same subject. By combining different types or sources of data, you can strengthen the validity of your findings.
These are four of the most common mixed methods designs:
Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.
But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples.
In multistage sampling, you can use probability or non-probability sampling methods.
For a probability sample, you have to probability sampling at every stage. You can mix it up by using simple random sampling, systematic sampling, or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.
Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.
Scientists and researchers must always adhere to a certain code of conduct when collecting data from others.
These considerations protect the rights of research participants, enhance research validity, and maintain scientific integrity.
Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.
Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations.
You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.
You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.
Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.
These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.
Want to contact us directly? No problem. We are always here for you.
Our team helps students graduate by offering:
Scribbr specializes in editing study-related documents. We proofread:
The Scribbr Plagiarism Checker is powered by elements of Turnitin’s Similarity Checker, namely the plagiarism detection software and the Internet Archive and Premium Scholarly Publications content databases.
The Scribbr Citation Generator currently supports the following citation styles, and we’re working hard on supporting more styles in the future.
Scribbr uses industry-standard citation styles from the Citation Styles Language project.