
Experimental theory and experimental planning. Formal planning and assessment of validity as conditions for establishing an experimental effect. Requirements for experimental design

4.7. Experimental plans

An experimental design is the tactics of experimental research, embodied in a specific system of experimental-planning operations. The main criteria for classifying designs are:

Composition of participants (individual or group);

Number of independent variables and their levels;

Types of scales for presenting independent variables;

Method of collecting experimental data;

Place and conditions of the experiment;

Features of the organization of experimental influence and method of control.

Plans for groups of subjects and for one subject. All experimental plans can be divided according to the composition of participants into plans for groups of subjects and plans for one subject.

Experiments with groups of subjects have the following advantages: the ability to generalize the results of the experiment to the population; the possibility of using between-group comparison schemes; savings in time; and the applicability of statistical methods of analysis. The disadvantages of this type of experimental design include: the influence of individual differences between people on the results of the experiment; the problem of the representativeness of the experimental sample; and the problem of the equivalence of the groups of subjects.

Experiments with one subject are a special case of "small-N designs." J. Goodwin points to the following reasons for using such designs: the need for individual validity, since in large-N experiments a problem arises when the aggregated data characterize no individual subject. An experiment with one subject is also conducted in unique cases when, for various reasons, it is impossible to recruit many participants. In these cases, the purpose of the experiment is to analyze unique phenomena and individual characteristics.

An experiment with a small N, according to D. Martin, has the following advantages: the absence of complex statistical calculations, ease of interpreting the results, the ability to study unique cases, the involvement of only one or two participants, and ample opportunities for manipulating the independent variables. It also has some disadvantages, in particular the complexity of control procedures, the difficulty of generalizing the results, and relative inefficiency in time.

Let's consider plans for one subject.

Time-series designs. The main indicator of the influence of the independent variable on the dependent variable in such a design is the change in the nature of the subject's responses over time. The simplest strategy is the A – B scheme: the subject initially performs the activity under condition A and then under condition B. To control for the "placebo effect," the scheme A – B – A is used. (The "placebo effect" consists of reactions of subjects to "empty" treatments that match their reactions to real treatments.) In this case, the subject should not know in advance which condition is "empty" and which is real. However, these schemes do not take into account the interaction of treatments, so when planning time series one usually uses schemes of regular alternation (A – B – A – B), positional adjustment (A – B – B – A), or random alternation. Using "longer" time series increases the chance of detecting an effect, but it leads to a number of negative consequences, such as fatigue of the subject and reduced control over other additional variables.
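These alternation schemes can be generated programmatically, which is handy when scripting a single-subject session. A minimal Python sketch; the function names are my own, not from the source:

```python
import random

def regular_alternation(n_blocks):
    """Regular alternation: A-B-A-B-... for n_blocks repetitions of the pair."""
    return ["A", "B"] * n_blocks

def positional_adjustment(n_pairs):
    """ABBA counterbalancing: each A-B pair is mirrored to cancel order effects."""
    seq = []
    for _ in range(n_pairs):
        seq += ["A", "B", "B", "A"]
    return seq

def random_alternation(n_trials, seed=0):
    """Random assignment of conditions across trials (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [rng.choice(["A", "B"]) for _ in range(n_trials)]

print(regular_alternation(2))    # ['A', 'B', 'A', 'B']
print(positional_adjustment(1))  # ['A', 'B', 'B', 'A']
```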

The alternating-treatments design is a development of the time-series design. Its specificity is that treatments A and B are randomly distributed over time and presented to the subject separately. The effects of each treatment are then compared.

The reversal design is used to study two alternative forms of behavior. Initially, a baseline level of both forms of behavior is recorded. Then a complex treatment is presented, consisting of a specific component for the first form of behavior and an additional one for the second. After a certain time, the combination of treatments is modified. The effects of the two complex treatments are then assessed.

The criterion-increasing design is often used in educational psychology. Its essence is that a change in the subject's behavior is recorded in response to increasing exposure; the next level of exposure is presented only after the subject reaches the specified criterion level.
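The criterion rule (advance the exposure only after the criterion is reached) can be sketched as a tiny simulation. The helper function and the response/criterion numbers below are hypothetical, purely for illustration:

```python
def criterion_increasing_plan(responses, criteria):
    """Walk through trial responses; advance to the next exposure level
    only once the subject's response reaches the current criterion.
    Returns the exposure level that was active on each trial."""
    level = 0
    levels = []
    for r in responses:
        levels.append(level)
        if level < len(criteria) and r >= criteria[level]:
            level += 1  # criterion met: present the next impact
    return levels

# responses grow over trials; criterion levels are 5 and then 8
print(criterion_increasing_plan([2, 4, 5, 6, 8, 9], [5, 8]))
# -> [0, 0, 0, 1, 1, 2]
```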

When conducting experiments with one subject, it should be borne in mind that the main artifacts are practically unavoidable. Moreover, in this case, as in no other, the experimenter's attitudes and the relationship that develops between the experimenter and the subject influence the results.

R. Gottsdanker suggests distinguishing qualitative and quantitative experimental designs. In qualitative designs, the independent variable is presented on a nominal scale, i.e., two or more qualitatively different conditions are used in the experiment.

In quantitative experimental designs, the levels of the independent variable are presented on interval, rank, or ratio scales, i.e., the experiment uses degrees of expression of a particular condition.

It is possible that in a factorial experiment one variable will be presented in quantitative form and the other in qualitative form. In this case, the plan will be combined.

Within-group and between-group experimental designs. T.V. Kornilova defines two types of experimental designs according to the criterion of the number of groups and experimental conditions: within-group and between-group. Within-group designs are those in which the influence of the variants of the independent variable and the measurement of the experimental effect occur in the same group. In between-group designs, the variants of the independent variable are administered to different experimental groups.

The advantages of the within-group design are: a smaller number of participants, elimination of the factor of individual differences, a reduction in the total time of the experiment, and the ability to demonstrate the statistical significance of the experimental effect. Its disadvantages include the non-constancy of conditions and the manifestation of the "sequence effect."

The advantages of the between-group design are: the absence of a "sequence effect," the possibility of obtaining more data, a reduction in the participation time of each subject, and a reduced effect of participant dropout. The main disadvantage of the between-group design is the non-equivalence of the groups.

Single-independent-variable and factorial designs. According to the criterion of the number of experimental treatments, D. Martin proposes distinguishing designs with one independent variable, factorial designs, and designs with a series of experiments. In designs with one independent variable, the experimenter manipulates a single independent variable, which can have an unlimited number of manifestations. In factorial designs (for details about them, see p. 120), the experimenter manipulates two or more independent variables and explores all possible combinations of their different levels.

Plans with a series of experiments are carried out to gradually eliminate competing hypotheses. At the end of the series, the experimenter comes to verify one hypothesis.

Pre-experimental, quasi-experimental, and true experimental designs. D. Campbell proposed dividing all experimental designs for groups of subjects into the following groups: pre-experimental, quasi-experimental, and true experimental designs. This division is based on how close a real experiment comes to an ideal one. The fewer artifacts a particular design provokes and the stricter the control of additional variables, the closer the experiment is to the ideal. Pre-experimental designs take the requirements of the ideal experiment into account least of all. V.N. Druzhinin points out that they can serve only as illustrations and should, if possible, be avoided in the practice of scientific research. Quasi-experimental designs are an attempt to take the realities of life into account when conducting empirical research; they are created as deliberate departures from the designs of true experiments. The researcher must be aware of the sources of artifacts, the external additional variables that he cannot control. A quasi-experimental design is used when a better design cannot be applied.

Systematic features of pre-experimental, quasi-experimental and true experimental designs are given in the table below.

When describing experimental designs, we will use the symbols proposed by D. Campbell: R is randomization; X is the experimental treatment; O is testing.

Pre-experimental designs include: 1) the single case study; 2) the one-group design with preliminary and final testing; 3) the comparison of statistical groups.

In a single case study, one group is tested once, after the experimental treatment. Schematically, this design can be written as:

X O

Control of external variables and independent variable is completely absent. In such an experiment there is no material for comparison. The results can only be compared with everyday ideas about reality; they do not carry scientific information.

The one-group design with preliminary and final testing is often used in sociological, socio-psychological, and pedagogical research. It can be written as:

O1 X O2

This design has no control group, so it cannot be argued that the changes in the dependent variable (the difference between O1 and O2) recorded during testing are caused precisely by changes in the independent variable. Between the initial and final testing, other "background" events may occur that affect the subjects along with the independent variable. This design also fails to control for the natural development effect and the testing effect.

The comparison of statistical groups would more accurately be called a design with two non-equivalent groups and post-exposure testing. It can be written like this:

X O1
O2

This design allows for the testing effect to be taken into account by introducing a control group to control for a number of external variables. However, with its help it is impossible to take into account the effect of natural development, since there is no material to compare the state of the subjects at the moment with their initial state (preliminary testing was not carried out). To compare the results of the control and experimental groups, Student's t-test is used. However, it should be taken into account that differences in test results may not be due to experimental effects, but to differences in group composition.
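As an illustration, the pooled two-sample Student's t statistic used to compare the control and experimental groups can be computed directly. A minimal Python sketch with made-up scores; a real analysis would also compare t with the critical value for the given degrees of freedom:

```python
from math import sqrt
from statistics import mean, variance

def students_t(x, y):
    """Pooled two-sample Student's t statistic (equal variances assumed).
    Degrees of freedom: n1 + n2 - 2."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2

experimental = [12, 14, 11, 15, 13]  # O1: group tested after exposure X
control      = [10, 9, 11, 10, 8]    # O2: group tested without exposure
t, df = students_t(experimental, control)
print(round(t, 2), df)
```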

Quasi-experimental designs are a kind of compromise between reality and the strict framework of true experiments. There are the following types of quasi-experimental designs in psychological research: 1) experimental plans for non-equivalent groups; 2) designs with pre-test and post-test of different randomized groups; 3) plans of discrete time series.

The experimental design for non-equivalent groups is aimed at establishing a cause-and-effect relationship between variables, but it lacks a procedure for equalizing the groups (randomization). It can be represented by the following diagram:

O1 X O2
O3 O4

In this case, two real groups take part in the experiment. Both groups are tested. Then one group is exposed to the experimental treatment while the other is not, after which both groups are retested. The results of the first and second testing of both groups are compared; Student's t-test and analysis of variance are used for the comparison. The difference between O2 and O4 indicates natural development and background exposure. To identify the effect of the independent variable, it is necessary to compare δ(O1, O2) and δ(O3, O4), i.e., the magnitudes of the shifts in the indicators. The significance of the difference between the increments of the indicators will indicate the influence of the independent variable on the dependent one. This design is similar to the true two-group experiment with pre- and post-exposure testing (see page 118). The main source of artifacts is the difference in the composition of the groups.
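The comparison of shifts rather than raw scores can be sketched as follows (hypothetical data; each δ is computed per subject as posttest minus pretest):

```python
from statistics import mean

def gains(pre, post):
    """Per-subject shift in the indicator: delta = posttest - pretest."""
    return [b - a for a, b in zip(pre, post)]

# non-equivalent-groups design: judge the effect by comparing gains
o1, o2 = [10, 12, 11], [15, 18, 16]  # experimental group: pretest, posttest
o3, o4 = [11, 10, 12], [12, 11, 13]  # control group: pretest, posttest

d_exp  = gains(o1, o2)   # delta(O1, O2)
d_ctrl = gains(o3, o4)   # delta(O3, O4)
print(mean(d_exp), mean(d_ctrl))  # mean gains, then compared e.g. by t-test
```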

The design with pretest and posttest of different randomized groups differs from a true experimental design in that one group receives the pretest while an equivalent group receives the posttest after exposure:

R O1
R X O2

The main disadvantage of this quasi-experimental design is the inability to control for background effects—the influence of events that occur alongside the experimental treatment between the first and second testing.

Plans discrete time series are divided into several types depending on the number of groups (one or several), as well as depending on the number of experimental effects (single or series of effects).

The discrete time-series design for one group of subjects consists of initially determining the baseline level of the dependent variable in a group of subjects with a series of sequential measurements. The experimental treatment is then applied and a similar series of measurements is carried out. The levels of the dependent variable before and after the treatment are compared. The outline of this design:

O1 O2 O3 X O4 O5 O6

The main disadvantage of a discrete time series design is that it does not allow one to separate the effect of the independent variable from the effect of background events that occur during the course of the study.

A modification of this design is the time-series quasi-experiment in which exposure before measurement alternates with no exposure before measurement. Its scheme is as follows:

X O1 – O2 X O3 – O4 X O5

Alternation can be regular or random. This option is suitable only if the effect is reversible. When processing the data obtained in the experiment, the series is divided into two sequences, and the results of the measurements preceded by exposure are compared with the results of the measurements without exposure. To compare the data, Student's t-test with n – 2 degrees of freedom is used, where n is the number of situations of the same type.
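Splitting such a series into its two sequences can be sketched as follows (illustrative numbers; the scheme X O1 – O2 X O3 – O4 X O5 yields three measurements taken after exposure and two taken without it):

```python
# measurements from the series X O1 - O2 X O3 - O4 X O5
measurements = {"O1": 14, "O2": 9, "O3": 15, "O4": 10, "O5": 16}

exposed     = [measurements[k] for k in ("O1", "O3", "O5")]  # measured after X
not_exposed = [measurements[k] for k in ("O2", "O4")]        # measured without X

n = len(exposed) + len(not_exposed)
df = n - 2  # degrees of freedom for Student's t-test
print(exposed, not_exposed, df)
```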

Time series plans are often implemented in practice. However, when using them, the so-called “Hawthorne effect” is often observed. It was first discovered by American scientists in 1939, when they conducted research at the Hawthorne plant in Chicago. It was assumed that changing the labor organization system would increase productivity. However, during the experiment, any changes in the organization of work led to an increase in productivity. As a result, it turned out that participation in the experiment itself increased motivation to work. The subjects realized that they were personally interested in them and began to work more productively. To control for this effect, a control group must be used.

The time series design for two non-equivalent groups, one of which receives no intervention, looks like this:

O1 O2 O3 O4 O5 X O6 O7 O8 O9 O10

O1 O2 O3 O4 O5 O6 O7 O8 O9 O10

This plan allows you to control the “background” effect. It is usually used by researchers when studying real groups in educational institutions, clinics, and production.

Another specific design often used in psychology is the ex post facto experiment. It is frequently applied in sociology and pedagogy, as well as in neuropsychology and clinical psychology. The strategy of this design is as follows. The experimenter does not influence the subjects himself; the "treatment" is some real event in their lives. The experimental group consists of people who were exposed to the event, and the control group of people who were not. The groups are, as far as possible, equalized with respect to their state before the exposure. The dependent variable is then tested in the experimental and control groups, the data are compared, and a conclusion is drawn about the effect of the exposure on the subjects' subsequent behavior. The ex post facto design thus simulates a two-group experimental design with equalization of the groups and post-exposure testing. Its scheme is as follows:

X O1
O2

If group equivalence can be achieved, then the design becomes a true experimental design. It is implemented in many modern studies. For example, in the study of post-traumatic stress, when people who have suffered the effects of a natural or man-made disaster, or combatants, are tested for the presence of PTSD, their results are compared with the results of a control group, which makes it possible to identify the mechanisms of such reactions. In neuropsychology, brain injuries, lesions of certain structures, considered as “experimental exposure,” provide a unique opportunity to identify the localization of mental functions.

True experimental designs for one independent variable differ from the others as follows:

1) using strategies to create equivalent groups (randomization);

2) the presence of at least one experimental and one control group;

3) final testing and comparison of the results of groups that received and did not receive the intervention.

Let's take a closer look at some experimental designs for one independent variable.

The design for two randomized groups with post-exposure testing. Its diagram looks like this:

R X O1
R O2

This design is used when it is not possible or necessary to conduct preliminary testing. If the experimental and control groups are equivalent, this design is the best, since it allows most sources of artifacts to be controlled. The absence of pretesting excludes both the interaction effect between the testing procedure and the experimental task and the testing effect itself. The design allows control of the influence of group composition, spontaneous attrition, background and natural development effects, and the interaction of group composition with other factors.

In the example considered, one level of influence of the independent variable was used. If it has several levels, then the number of experimental groups increases to the number of levels of the independent variable.

The design for two randomized groups with pretest and posttest. Its outline looks like this:

R O1 X O2
R O3 O4

This design is used if there is doubt about the results of the randomization. The main source of artifacts is the interaction of testing and the experimental manipulation. In practice, one also has to deal with the effect of non-simultaneous testing, so it is considered best to test the members of the experimental and control groups in random order. Presentation or non-presentation of the experimental treatment is also best done in random order. D. Campbell notes the need to control for "intra-group events." This experimental design controls well for the background effect and the natural development effect.

When processing the data, the parametric tests t and F are usually used (for data on an interval scale). Three t values are calculated: 1) between O1 and O2; 2) between O3 and O4; 3) between O2 and O4. The hypothesis about a significant influence of the independent variable on the dependent variable can be accepted if two conditions are met: 1) the differences between O1 and O2 are significant while those between O3 and O4 are not, and 2) the differences between O2 and O4 are significant. Sometimes it is more convenient to compare not the absolute values but the magnitudes of the increments of the indicators, δ(1–2) and δ(3–4). These values are also compared using Student's t-test. If the differences are significant, the experimental hypothesis about the influence of the independent variable on the dependent variable is accepted.
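The two-part decision rule described above can be written out explicitly. The p-values below are hypothetical, and alpha = 0.05 is an assumed significance level:

```python
ALPHA = 0.05

def effect_established(p_o1_o2, p_o3_o4, p_o2_o4, alpha=ALPHA):
    """Decision rule for the two-randomized-group pretest-posttest design:
    accept the effect of the independent variable when
    (1) O1 vs O2 differ significantly while O3 vs O4 do not, and
    (2) O2 vs O4 differ significantly."""
    return (p_o1_o2 < alpha) and (p_o3_o4 >= alpha) and (p_o2_o4 < alpha)

# hypothetical p-values from the three t-tests
print(effect_established(0.01, 0.40, 0.03))  # True: both conditions hold
print(effect_established(0.01, 0.02, 0.03))  # False: the control group also shifted
```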

Solomon's design is a combination of the two previous designs. Its implementation requires two experimental (E) and two control (C) groups. Its diagram looks like this:

R O1 X O2
R O3 O4
R X O5
R O6

This design can control for the pretest interaction effect and the experimental effect. The effect of experimental influence is revealed by comparing the indicators: O1 and O2; O2 and O4; O5 and O6; O5 and O3. Comparison of O6, O1 and O3 allows us to identify the influence of the factor of natural development and background influences on the dependent variable.

Now consider a design for one independent variable and several groups.

The design for three randomized groups and three levels of the independent variable is used when it is necessary to identify quantitative relationships between the independent and dependent variables. Its diagram looks like this:

R X1 O1
R X2 O2
R X3 O3

In this design, each group is presented with only one level of the independent variable. If necessary, you can increase the number of experimental groups in accordance with the number of levels of the independent variable. All of the above statistical methods can be used to process the data obtained using such an experimental design.

Factorial experimental designs are used to test complex hypotheses about relationships between variables. In a factorial experiment, as a rule, two types of hypotheses are tested: 1) hypotheses about the separate influence of each independent variable; 2) hypotheses about the interaction of the variables. A factorial design involves combining all levels of the independent variables with one another. The number of experimental groups equals the number of combinations.

The factorial design for two independent variables with two levels each (2 x 2). This is the simplest of the factorial designs. Its diagram looks like this (each group receives one combination of levels):

Group 1: A1 B1
Group 2: A1 B2
Group 3: A2 B1
Group 4: A2 B2

This design reveals the effect of two independent variables on one dependent variable. The experimenter combines the possible variables and levels. Sometimes four independent randomized experimental groups are used. Fisher's analysis of variance is used to process the results.
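With cell means in hand, a main effect and the interaction in a 2 x 2 design can be computed directly. A sketch with invented scores; a full analysis would use Fisher's ANOVA rather than these raw contrasts:

```python
from statistics import mean

# scores of the dependent variable in each cell of a 2 x 2 factorial design
cells = {
    ("A1", "B1"): [5, 6, 5],  ("A1", "B2"): [7, 8, 7],
    ("A2", "B1"): [6, 5, 6],  ("A2", "B2"): [12, 13, 12],
}
m = {k: mean(v) for k, v in cells.items()}

# main effect of A: average difference between the levels of A
effect_A = (m[("A2", "B1")] + m[("A2", "B2")]) / 2 \
         - (m[("A1", "B1")] + m[("A1", "B2")]) / 2
# interaction: does the effect of B depend on the level of A?
interaction = (m[("A2", "B2")] - m[("A2", "B1")]) \
            - (m[("A1", "B2")] - m[("A1", "B1")])
print(effect_A, interaction)
```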

There are more complex versions of the factorial design: 3 x 2, 3 x 3, etc. Adding a level to any independent variable increases the number of experimental groups.
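The growth in the number of groups is simply the product of the numbers of levels, which can be checked with `itertools.product` (the level names here are arbitrary):

```python
from itertools import product

def factorial_groups(*levels):
    """All combinations of factor levels; one experimental group per combination."""
    return list(product(*levels))

groups_2x2 = factorial_groups(["A1", "A2"], ["B1", "B2"])
groups_3x2 = factorial_groups(["A1", "A2", "A3"], ["B1", "B2"])
print(len(groups_2x2), len(groups_3x2))  # 4 groups, then 6 groups
```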

"Latin Square". It is a simplification of a complete design for three independent variables having two or more levels. The Latin square principle is that two levels of different variables occur only once in an experimental design. This significantly reduces the number of groups and the experimental sample as a whole.

For example, for three independent variables, L (levels 1, 2, 3), M (levels 1, 2, 3), and N (levels A, B, C), the design using the "Latin square" method will look like this:

     M1  M2  M3
L1   A   B   C
L2   B   C   A
L3   C   A   B

In this case, each level of the third independent variable (A, B, C) occurs once in each row and each column. By combining the results across rows, columns, and levels, it is possible to identify the influence of each independent variable on the dependent variable, as well as the degree of pairwise interaction between the variables. The use of the Latin letters A, B, C to designate the levels of the third variable is traditional, which is why the method is called the "Latin square."
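A Latin square can be built by cyclic shifts, and its defining property (each symbol exactly once per row and per column) verified directly. A minimal sketch:

```python
def latin_square(symbols):
    """Cyclic-shift construction of an n x n Latin square."""
    n = len(symbols)
    return [[symbols[(i + j) % n] for j in range(n)] for i in range(n)]

sq = latin_square(["A", "B", "C"])
for row in sq:
    print(row)

# verify the Latin-square property
n = len(sq)
assert all(len(set(row)) == n for row in sq)                          # rows
assert all(len({sq[i][j] for i in range(n)}) == n for j in range(n))  # columns
```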

"Greco-Latin square". This design is used when the influence of four independent variables needs to be examined. It is constructed on the basis of a Latin square for three variables, with a Greek letter attached to each Latin group of the design, indicating the levels of the fourth variable. A design for a design with four independent variables, each with three levels, would look like this:

To process the data obtained in the “Greco-Latin square” design, the Fisher analysis of variance method is used.
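The orthogonality that makes a Greco-Latin square work (every Latin-Greek pair occurs exactly once) can be checked by superimposing two mutually orthogonal Latin squares. A sketch; the particular squares below are one possible choice:

```python
def latin_square(symbols, shift=1):
    """Latin square built by shifting each row by `shift` positions."""
    n = len(symbols)
    return [[symbols[(i * shift + j) % n] for j in range(n)] for i in range(n)]

latin = latin_square(["A", "B", "C"], shift=1)  # levels of the third variable
greek = latin_square(["α", "β", "γ"], shift=2)  # levels of the fourth variable

# orthogonality: every (Latin, Greek) pair occurs exactly once in 3 x 3 = 9 cells
pairs = {(latin[i][j], greek[i][j]) for i in range(3) for j in range(3)}
assert len(pairs) == 9
print([[latin[i][j] + greek[i][j] for j in range(3)] for i in range(3)])
```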

The main problem that factorial designs can solve is determining the interaction of two or more variables. This problem cannot be solved using several conventional experiments with one independent variable. In a factorial design, instead of trying to “cleanse” the experimental situation of additional variables (with a threat to external validity), the experimenter brings it closer to reality by introducing some additional variables into the category of independent ones. At the same time, the analysis of connections between the studied characteristics allows us to identify hidden structural factors on which the parameters of the measured variable depend.

Planning an experiment is one of the most important stages in organizing psychological research, in which the researcher tries to construct the optimal model (that is, plan) of the experiment for implementation in practice. A well-designed research plan allows optimal values of validity, reliability, and accuracy to be achieved in the study and provides for nuances that are hard to track during everyday "spontaneous experimentation." Often, in order to adjust the plan, experimenters conduct a so-called pilot, or trial, study, which can be considered a "draft" of the future scientific experiment.
An experimental design is created to answer basic questions about:

· the number of independent variables that are used in the experiment (one or several?);

· the number of levels of the independent variable (does the independent variable change or remain constant?);

· methods for controlling additional or confounding variables (which ones are necessary and advisable to use?):

o direct control method (direct exclusion of a known additional variable),
o leveling method (take into account a known additional variable when it is impossible to exclude it),
o randomization method (random selection of groups in case of unknown additional variable).
One of the most important questions that an experimental design must answer is to determine in what sequence the changes in the stimuli under consideration (independent variables) affecting the dependent variable should occur. The sequence of presentation of stimuli is a very important issue that directly relates to the validity of the study: for example, if a person is constantly presented with the same stimulus, he may become less susceptible to it.
Types of plans:
1. Simple (one-factor) plans - involve studying the influence of only one independent variable on the dependent variable. The advantage of such designs is their effectiveness in establishing the influence of the independent variable, as well as the ease of analysis and interpretation of the results. The disadvantage is the inability to infer a functional relationship between the independent and dependent variables.
- Experiments with reproducible conditions. Compared to experiments involving two independent groups, such designs require fewer participants. The design does not imply the presence of different groups (for example, experimental and control). The purpose of such experiments is to establish the effect of one factor on one variable.
- Experiments involving two independent groups (experimental and control) - experiments in which only the experimental group is exposed to experimental effects, while the control group continues to do what it usually does. The purpose is to test the effect of one independent variable.
2. Complex designs are used for experiments that examine either the effects of several independent variables (factorial designs) or the sequential effects of different levels of a single independent variable (multilevel designs).
- Plans for multi-level experiments. When experiments use one independent variable, a situation where only two of its values ​​are studied is considered the exception rather than the rule. Most univariate studies use three or more values ​​of the independent variable—such designs are often called univariate multilevel designs. Such designs can be used both to study nonlinear effects (that is, cases where the independent variable takes on more than two values) and to test alternative hypotheses. The advantage is the ability to determine the type of functional relationship between the independent and dependent variables. The disadvantage is that it takes a lot of time and also requires more participants.
- Factorial designs involve the use of more than one independent variable. There can be any number of such variables or factors, but usually they are limited to using two, three, or less often four. Factorial designs are described using a numbering system showing the number of independent variables and the number of values ​​(levels) each variable takes. For example, a 2x3 factorial design has two independent variables (factors), the first of which takes two values ​​(“2”), and the second takes three values ​​(“3”).
3. Quasi-experimental designs - plans for experiments in which, due to incomplete control of variables, it is impossible to draw conclusions about the existence of a cause-and-effect relationship. These plans are often used in applied psychology.
- Ex post facto designs are studies in which data are collected and analyzed after the event has already occurred; many classify them as quasi-experimental. The essence of such a study is that the experimenter does not influence the subjects himself: the "treatment" is some real event in their lives. The study design simulates a rigorous experimental design with equalization or randomization of the groups and post-exposure testing.
- Small-N experimental designs are also called "single-subject designs," since the behavior of each subject is considered individually. One of the main reasons for using small-N experiments is that, in some cases, results obtained by generalizing over large groups of people cannot be applied to any individual participant (which leads to a violation of individual validity). Ebbinghaus's introspective studies can be classified as small-N experiments (the only subject he studied was himself). A single-subject design must meet at least three conditions:
1. The target behavior must be precisely defined in terms of events that are easy to record.
2. It is necessary to establish a baseline level of response.
3. It is necessary to influence the subject and record his behavior.
4. Correlational research designs are studies conducted to confirm or refute a hypothesis about a statistical relationship (correlation) between several (two or more) variables. A correlational design differs from a quasi-experimental one in that it involves no controlled influence on the object of research. In a correlational study, the scientist hypothesizes a statistical connection between several mental properties of an individual or between certain external variables and mental states. The subjects must be in equivalent, unaltered conditions. Types of correlational studies:



· Comparison of two groups

· One-dimensional study

· Correlation study of pairwise equivalent groups

· Multivariate correlation study

· Structural correlation research

· Longitudinal correlation study

Determining the experimental sample includes:

1. Determining the sample composition.
2. Determining the sample size.
3. Determining the sampling method.


Experiment planning includes two stages.
1. Meaningful planning of the experiment:
- Determination of a number of theoretical and experimental provisions that form the theoretical basis of the study.
- Formulation of theoretical and experimental research hypotheses.
- Selection of the required experimental method.
- Solving the issue of sampling subjects: Determining the composition, volume and method of sampling.
2. Formal planning of the experiment:
- Achieving the ability to compare results.
- Achieving the possibility of discussing the data obtained.
- Ensuring cost-effective research.
Formal planning involves selecting an experimental design, i.e., a plan for varying the conditions of the independent variable (IV), and determining the minimum effect size for the expected action of the IV. The data collection plan is also the plan according to which the dependent variable (DV) is measured. The main goal of formal planning is to eliminate as many potential sources of distortion of the results as possible.
The tasks of formal planning facing the researcher:
- ensuring the validity of the experiment;
- providing conditions for making a decision about the experimental effect, i.e., the action of the IV;
- applying data-processing schemes that are adequate to the metric of the scales used and to the method of data collection.
In the narrow sense, experimental planning includes two points related to the subsequent statistical decision:
1. How will the experimental effect, i.e. the relation between the IV and the DV, be assessed?
2. Establishing a minimum effect sufficient for a judgment about the differences obtained between the experimental and control conditions, or about the observed relationship between measurements of the IV and the DV. Establishing the minimum effect includes setting the acceptable probabilities of Type I (alpha) and Type II (beta) errors.
There are experimental effects that can be established only with statistical methods, and there are those in which the change in the DV is so noticeable that no statistics are needed.
The magnitude of the minimum effect is related to the amount of experimental data, i.e. to the number of sampled values of the DV indicators. Increasing the sample size (the number of subjects or the number of trials) can substantially reduce the effect magnitude sufficient for a decision about the action of the IV, but sample size remains tied to the content side of planning (control of time factors, sample representativeness, etc.).
Formal design for testing a psychological hypothesis is possible in those cases of psychological research where the traditional approach is adopted: variables are identified and manipulated independently of one another.
I. The first stage of experimental planning is carried out while hypotheses and variables are being specified, so that the specificity of the psychological reality under study is not lost: the psychological explanation fixed in the hypothetical constructs and in the formulation of the cause-and-effect relationship must be meaningfully correlated with the type of empirical relationship to be established and with the conditions for identifying it, including the methods for setting the IV conditions and the choice of methods for recording the DV indicator.
II. The second stage is the determination of an adequate data-collection scheme and of the number of samples needed to control the factors that threaten the validity of the experiment; here the psychologist accepts the conventions of a number of provisions.
At the stages of formal planning, decisions are made about the magnitude of the minimum effect of the X-influences, or of the shift in the DV measured at different levels of the IV, that is accepted as sufficient from the point of view of the possibility of rejecting the null hypothesis, as well as about the acceptable error levels when testing the statistical hypothesis.

Meaningful experiment planning
Planning includes two stages:
1. Meaningful planning of the experiment:
- Determination of a number of theoretical and experimental provisions that form the theoretical basis of the study; statement of the problem, or definition of the topic. Any research begins with defining a topic (it limits what will be researched). A study is carried out in three cases:
1-testing the hypothesis about the existence of the phenomenon;
2-testing the hypothesis about the existence of a connection between phenomena;
3-testing the hypothesis about the causal dependence of phenomenon A on phenomenon B.
The primary formulation of the problem takes the form of a hypothesis. A psychological (experimental) hypothesis is a hypothesis about a mental phenomenon whose testing tool is psychological research.
- Formulation of theoretical and experimental research hypotheses. The stage of clarifying the hypothesis and identifying variables. Determination of the experimental hypothesis.
- Selection of the required experimental method.
- Selection of experimental instrument and experimental conditions (answers the question “how to organize a study?”):
Allows control of the independent variable. An independent variable is a variable in a scientific experiment that is deliberately manipulated or selected by the experimenter in order to determine its effect on the dependent variable.
Allows recording of the dependent variable. A dependent variable is a measured variable in a scientific experiment whose changes are associated with changes in the independent variable.
- Solving the issue of sampling subjects:
- Determination of sample composition.
- Determination of sample size.
- Determining the sampling method.
- Randomization (random selection). Used to create simple random samples, based on the assumption that each member of the population has an equal chance of being included in the sample. For example, to make a random sample of 100 university students, you can put pieces of paper with the names of all university students in a hat, and then take 100 pieces of paper out of it - this will be a random selection.
- Pairwise selection is a strategy for constructing sampling groups in which the groups are made up of subjects equivalent in the secondary parameters significant for the experiment. This strategy is effective for experiments with experimental and control groups; the best option is the involvement of twin pairs (mono- and dizygotic), since it allows maximally equivalent groups to be created.
- Stratometric selection. Stratometric selection - randomization with the allocation of strata (or clusters). With this method of sampling, the general population is divided into groups (strata) with certain characteristics (gender, age, political preferences, education, income level, etc.), and subjects with the corresponding characteristics are selected.
- Approximate modeling. Approximate modeling is drawing a limited sample and generalizing the conclusions about it to a wider population. For example, if 2nd-year university students participate in the study, its data are applied to “people aged 17 to 21”. The admissibility of such generalizations is extremely limited.
- Attracting real groups
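The two sampling strategies that lend themselves most directly to automation, simple randomization and stratometric (stratified) selection, can be sketched in a few lines of Python. This is a minimal illustration: the population, the attribute names and the group sizes are invented for the example.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Each member has an equal chance of inclusion (the 'names in a hat' draw)."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, strata_key, per_stratum, seed=0):
    """Stratometric selection: split the population into strata, then sample within each."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(strata_key(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# A toy population of 200 "students" with a gender attribute.
students = [{"id": i, "gender": "f" if i % 2 else "m"} for i in range(200)]
drawn = simple_random_sample(students, 100)
by_gender = stratified_sample(students, lambda s: s["gender"], per_stratum=50)
```

The stratified version guarantees the proportions of the strata in the sample, which plain randomization only approximates.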
2. Formal planning of the experiment:
- Achieving the ability to compare results.
- Achieving the possibility of discussing the data obtained.
- Ensuring cost-effective research.
The main goal of formal planning is to eliminate the maximum possible number of reasons for distorting the results.

Factorial design of experiment
Factorial experiments are used when it is necessary to test complex hypotheses about the relationships between variables. The general form of such a hypothesis is: “If A1, A2,..., An, then B.” Such hypotheses are called complex, combined, etc. In this case, there can be various relationships between independent variables: conjunction, disjunction, linear independence, additive or multiplicative, etc. Factorial experiments are a special case of multivariate research, during which they try to establish relationships between several independent and several dependent variables. In a factorial experiment, as a rule, two types of hypotheses are tested simultaneously:
1) hypotheses about the separate influence of each of the independent variables;
2) hypotheses about the interaction of variables, namely, how the presence of one of the independent variables affects the effect on the other.
A factorial experiment is based on a factorial design. Factorial design of an experiment involves combining all levels of independent variables with each other. The number of experimental groups is equal to the number of combinations of levels of all independent variables.
The most commonly used factorial designs involve two independent variables with two levels each (the 2x2 type). To draw up the plan, the principle of balancing is applied. A 2x2 design is used to identify the effect of two independent variables on one dependent variable. The experimenter manipulates the possible combinations of variables and levels. The data are recorded in a simple table:

                   1st variable
2nd variable       Yes    No
Yes                 1      2
No                  3      4
To process the results, Fisher's analysis of variance is used.
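As a rough illustration of that computation, here is a minimal sketch of a balanced two-way ANOVA for a 2x2 design on simulated data. The effect sizes, cell counts and the data itself are invented for the example; a real analysis would normally use a statistics package.

```python
import numpy as np

def two_way_anova_2x2(cells):
    """Balanced two-way ANOVA for a 2x2 design.
    `cells` maps (level_of_A, level_of_B) -> 1-D array of DV scores, equal n per cell."""
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))            # scores per cell
    grand = np.concatenate(list(cells.values())).mean()
    mean_a = {a: np.concatenate([cells[(a, b)] for b in levels_b]).mean() for a in levels_a}
    mean_b = {b: np.concatenate([cells[(a, b)] for a in levels_a]).mean() for b in levels_b}
    ss_a = n * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    ss_ab = sum(n * (cells[(a, b)].mean() - mean_a[a] - mean_b[b] + grand) ** 2
                for a in levels_a for b in levels_b)
    ss_err = sum(((cells[k] - cells[k].mean()) ** 2).sum() for k in cells)
    ms_err = ss_err / (len(cells) * (n - 1))
    # With two levels per factor, every effect has one degree of freedom.
    return {"F_A": ss_a / ms_err, "F_B": ss_b / ms_err, "F_AB": ss_ab / ms_err}

rng = np.random.default_rng(0)
# Simulated scores: factor A adds 2 points, factor B adds 1.5, no interaction.
data = {(a, b): 10 + 2 * a + 1.5 * b + rng.normal(0, 1, 20)
        for a in (0, 1) for b in (0, 1)}
F = two_way_anova_2x2(data)
```

With these simulated effects, both main-effect F ratios come out large while the interaction F stays near its null expectation.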
Other versions of the factorial design, namely 3x2 or 3x3, are used less often. The 3x2 design is used when it is necessary to establish the type of dependence of one dependent variable on two independent variables, one of which is represented by a dichotomous parameter. An example of such a plan is an experiment to identify the effect of external observation on the success of solving intellectual problems. The first independent variable varies simply: observer present or absent. The second independent variable is the level of task difficulty. In this case we get a 3x2 design:

                     2nd variable (task difficulty)
1st variable         Easy    Medium    Hard
Observer present       1       2        3
Observer absent        4       5        6
The 3x3 design option is used if both independent variables have several levels and it is possible to identify the types of relationships between the dependent variable and the independent ones. This plan allows us to identify the influence of reinforcement on the success of completing tasks of varying difficulty.
                     Stimulation intensity
Task difficulty      Low    Medium    High
Low                   1       2        3
Medium                4       5        6
High                  7       8        9
In general, the design for two independent variables looks like N x M. The applicability of such plans is limited only by the need to recruit a large number of randomized groups. The amount of experimental work increases excessively with the addition of each level of any independent variable.
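The statement that the number of groups equals the product of the numbers of levels can be made concrete by enumerating the cells of a design; the variable names below are illustrative, echoing the examples above.

```python
from itertools import product

# Experimental conditions of an N x M x K factorial design: every combination
# of levels is one experimental group (cell).
observer = ["present", "absent"]          # N = 2 levels
difficulty = ["easy", "medium", "hard"]   # M = 3 levels
reward = ["low", "medium", "high"]        # K = 3 levels

cells = list(product(observer, difficulty, reward))
# The number of groups is the product of the numbers of levels: 2 * 3 * 3 = 18.
```

Adding one more level to any factor multiplies the number of required groups, which is exactly the growth in experimental work the text describes.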
In the case when we are interested in the success of completing an experimental series of tasks that depends not only on the overall stimulation (administered in the form of punishment, an electric shock) but also on the ratio of reward and punishment, a 3x3x3 design is used.
       L1    L2    L3
M1     A1    B2    C3
M2     B2    C3    A1
M3     C3    A1    B2
Two signs of a multi-level experiment:
1. the IV has more than two levels;
2. the order of presentation of these three or more conditions of the same IV is controlled by a special scheme that equalizes the ordinal position of each level in the overall sequence of conditions.
Such multi-level experiments are contrasted with bivalent ones (where the IV has two levels, and the experimental and control conditions may differ in quality or in quantity).
Quantitative assessment is an assessment on ordinal, ratio, or interval scales.
Classification of IV levels is a qualitative assessment and may be based on one or more criteria.
It is not the number of IVs that determines the transition to a quantitative experiment, but the possibility of measuring at least one of the IVs quantitatively.
Multi-level experiments are often built on factorial designs, since the second variable is the "order of levels" of the first IV.
The two most popular schemes are:
1. complete counterbalancing according to the Latin square scheme;
2. counterbalancing according to the balanced Latin square scheme.
Both schemes are variants of experimental designs in which all levels of the first IV are presented to every subject, while the second IV is formed by dividing the subjects into groups, each of which receives one of the possible sequences of the levels of the first IV.
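A balanced Latin square of the kind just described can be generated with the classic construction for an even number of conditions; this is a sketch of one standard scheme, not the only one in use.

```python
def balanced_latin_square(n):
    """Balanced Latin square for an even number of conditions n:
    each condition appears once in every ordinal position, and each condition
    immediately precedes every other condition equally often."""
    if n % 2:
        raise ValueError("the classic construction requires an even n")
    # First row: 0, 1, n-1, 2, n-2, 3, ...
    first = [0]
    lo, hi = 1, n - 1
    while len(first) < n:
        first.append(lo); lo += 1
        if len(first) < n:
            first.append(hi); hi -= 1
    # Each following row shifts every condition by one (mod n).
    return [[(x + r) % n for x in first] for r in range(n)]

square = balanced_latin_square(4)
# Each of the 4 groups of subjects receives one row as its order of IV levels.
```

For n = 4 this yields four orders in which every level occupies every ordinal position exactly once and every ordered pair of adjacent levels occurs exactly once, which is the "equalization of ordinal position" the text refers to.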
Let's consider the possible results of the simplest 2x2 factorial experiment from the standpoint of interactions of variables. To do this, we need to present the results of the experiments on a graph, where the values ​​of the first independent variable are plotted along the abscissa axis, and the values ​​of the dependent variable are plotted along the ordinate axis. Each of the two straight lines connecting the values ​​of the dependent variable at different values ​​of the first independent variable (A) characterizes one of the levels of the second independent variable (B). For simplicity, let us apply the results of a correlation study rather than an experimental one. Let us agree that we examined the dependence of a child’s status in a group on his state of health and level of intelligence. Let's consider options for possible relationships between variables.
First option: the lines are parallel - there is no interaction between the variables.
Sick children have a lower status than healthy children, regardless of their level of intelligence. Intellectuals always have a higher status (regardless of health).
The second option: physical health with a high level of intelligence increases the chance of receiving a higher status in the group (Figure 5.2).
In this case, the effect of divergent interaction between two independent variables is obtained. The second variable enhances the influence of the first on the dependent variable.
Third option: convergent interaction - physical health reduces the chance of an intellectual to acquire a higher status in the group. The “health” variable reduces the influence of the “intelligence” variable on the dependent variable. There are other cases of this interaction option:
The variables interact in such a way that an increase in the value of the first leads to a decrease in the influence of the second with a change in the sign of the dependence (Fig. 5.3).
Sick children with a high level of intelligence are less likely to receive a high status than sick children with low intelligence, and for healthy children there is a positive relationship between intelligence and status.
It is theoretically possible to imagine that sick children would have a greater chance of achieving high status with a high level of intelligence than their healthy, low-intelligence peers.
The last, fourth, possible variant of the relationships between independent variables observed in research: the case when there is an overlapping interaction between them, presented in the last graph (Fig. 5.4).
The magnitude of the interaction is assessed using analysis of variance, and Student's t-test is used to assess the significance of group differences.
In all considered experimental design options, a balancing method is used: different groups of subjects are placed in different experimental conditions. The procedure for equalizing the composition of groups allows for comparison of results.
However, in many cases it is necessary to design an experiment so that all participants receive all possible exposures to independent variables. Then the counterbalancing technique comes to the rescue.

Psychophysics. Methods for measuring sensitivity thresholds
Psychophysics is the branch of psychology that studies, by quantitative methods, the relationship between the strength of a stimulus and the magnitude of the resulting sensation. It was founded by G. Fechner in the second half of the 19th century. It seeks answers to the following questions:
1) What level of stimulation is needed to produce a sensation or sensory response?
2) By how much must the magnitude of a stimulus change for the change to be detected?
3) How does a sensation or sensory response change with a change in the magnitude of the stimulus?
To answer these and other questions, psychophysical methods are used. These include: the three classical methods for determining thresholds, introduced into psychophysics by G. Fechner; numerous psychophysical methods for scaling suprathreshold stimuli, used to obtain measures of the magnitude of sensation; and the methods of signal detection theory (SDT), used to obtain measures of “nominal” sensory sensitivity minimally distorted by the motives and attitudes of the subjects. The so-called classical methods (the method of limits, the method of adjustment and the method of constant stimuli) were first brought together and presented by Fechner in his "Elements of Psychophysics". They are used to determine absolute and difference thresholds. The absolute threshold is defined as the stimulus magnitude that produces 50% detections. Similarly, the difference threshold is the minimum change in stimulation that is detected 50% of the time.
Method of limits. With this method, the observer is presented in each individual trial series with a discrete sequence of stimuli that either monotonically increases (ascending series) or monotonically decreases (descending series) until the observer’s response changes from “yes” to “no” (in descending series) or from “no” to “yes” (in ascending series). The stimulation level corresponding to the midpoint of the interval at which the response changes is taken as the threshold value for that series.
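The procedure of the method of limits can be sketched with an idealized, noise-free observer; the true threshold, the step size and the stimulus range are invented for the illustration.

```python
def run_series(levels, true_threshold):
    """One series of the method of limits: step through `levels` until the
    observer's yes/no response changes; return the midpoint of that interval."""
    responses = [level >= true_threshold for level in levels]  # deterministic observer
    for prev, cur, r_prev, r_cur in zip(levels, levels[1:], responses, responses[1:]):
        if r_prev != r_cur:
            return (prev + cur) / 2
    return None  # the response never changed within this series

true_t = 5.3
ascending = list(range(0, 11))   # 0, 1, ..., 10: responses go "no" -> "yes"
descending = ascending[::-1]     # 10, ..., 0: responses go "yes" -> "no"
estimates = [run_series(ascending, true_t), run_series(descending, true_t)]
threshold = sum(estimates) / len(estimates)
```

Averaging ascending and descending series, as the classical procedure prescribes, cancels the systematic bias (habituation or expectation errors) each direction carries on its own.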
Method of adjustment. In contrast to the method of limits, this method lets the observer himself regulate a continuously changing stimulus in order to equate it with a given standard. Each trial consists of the observer adjusting a variable stimulus from a point of obvious inequality to the point of subjective equality with the standard. Ascending and descending trials alternate, along with a randomly varied initial deviation of the variable stimulus from the standard.
Method of constant stimuli. This method presents the observer in each individual trial with only one stimulus, selected from a fixed set of 4 to 9 stimuli. When determining the absolute threshold, the observer gives a yes/no answer in each trial. When determining the difference threshold, the observer compares a test stimulus from the set with the standard presented in each trial and answers “greater than / less than”. After preliminary testing, the set of test stimuli is formed so that they bracket the threshold and so that (ideally) each of them produces some percentage of detection or discrimination responses, but none is perceived in 100% of cases.
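A minimal simulation of the method of constant stimuli, assuming a logistic observer (an assumption of this sketch, not part of the classical method), with the 50%-detection threshold found by linear interpolation between the two stimuli whose detection rates bracket 0.5:

```python
import math, random

rng = random.Random(1)
true_threshold, slope = 5.0, 1.5
stimuli = [2, 3, 4, 5, 6, 7, 8]   # fixed set bracketing the threshold
n_trials = 400                    # trials per stimulus level

def p_yes(x):
    """Logistic observer: detection probability rises smoothly through 50% at the threshold."""
    return 1 / (1 + math.exp(-slope * (x - true_threshold)))

rates = {x: sum(rng.random() < p_yes(x) for _ in range(n_trials)) / n_trials
         for x in stimuli}

# The absolute threshold is the stimulus producing 50% detections, found by
# linear interpolation between the two levels whose rates bracket 0.5.
for lo, hi in zip(stimuli, stimuli[1:]):
    if rates[lo] <= 0.5 <= rates[hi]:
        threshold = lo + (0.5 - rates[lo]) / (rates[hi] - rates[lo]) * (hi - lo)
        break
```

With enough trials per level the interpolated threshold converges on the observer's true 50% point.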
Psychophysical methods for scaling suprathreshold stimuli. These are a collection of widely varying methods that have in common only the fact that they prescribe rules by which subjects (directly or indirectly) assign numerical scale values to physical stimuli. They have often been used to test particular psychophysical laws. Among them are the methods of bisection, subjectively equal intervals, fractionation, and magnitude estimation.
With the bisection method, the subject is instructed to adjust the magnitude of a variable stimulus so that the resulting sensation seems to him equidistant from the sensations produced by two constant stimuli that set the boundaries of the interval to be halved. The procedure is repeated many times, after which the arithmetic mean of the values of the variable stimulus selected by the subject is calculated.
The method of subjectively equal intervals, a variation of the category method, gives the observer the opportunity to assign each presented stimulus to one of a number of “equally wide” categories; the number of categories (for example, 5) is set by the experimenter and does not change during the experiment. The most extreme stimuli are presented first and identified as such, so as to serve as reference points for subsequent judgments. After the observer has classified all the stimuli, their subjective values, defined as the mean or median category, are plotted as a function of the objective stimulus value.
The fractionation method requires the observer to create (by adjustment or selection) on each trial a new stimulus constituting a predetermined fraction (e.g., half) of the stimulus presented to him. This is done for each stimulus in the stimulus set.
The magnitude estimation method is a widely used procedure in which the observer estimates the magnitude of stimuli by assigning numbers to them.
Stimuli that are stronger than the reference stimulus usually receive larger numbers, and weaker ones smaller. For each stimulus, the arithmetic or geometric mean of the numerical ratings obtained across the group of subjects is calculated, and the resulting average subjective estimates of stimulus magnitude are plotted as a function of the actual stimulus magnitude.
Signal detection theory. Motivation, expectancy and observer attitude bias threshold measurements relative to the true value in psychophysical experiments. Thus, with the method of constant stimuli, “yes” answers still appear in empty (“catch”) trials, in which no stimulus is presented to the observer. In signal detection theory (SDT) this response is called a false alarm; correct detection of the stimulus (answering “yes” when it is present) is called a hit. Changes in motivation, expectations or attitudes may raise the hit rate, but at the cost of a rising false-alarm rate. In each of the three main SDT methods (yes-no, rating, and forced choice) a random sequence of trials is specified (for example, 200) in which the signal is either presented against a noise background (“signal + noise” trials) or absent (“noise-only” trials). With the yes-no method, the observer’s task is to answer “yes” in trials where the signal is present and “no” in trials where it is absent. In the rating procedure, the observer’s response amounts to choosing, from a given set of rating categories, the one that reflects his degree of confidence that a signal was present in the given trial. In a forced-choice experiment, the observer is offered a choice among two or more alternatives (for example, observation intervals separated in time), one and only one of which contains signal plus noise.
The observer must choose the alternative most likely to contain the signal. The influence of motivation, expectations and attitudes on the subjects’ responses in psychophysical experiments is captured by the observer’s criterion, assessed from the percentage of false alarms. The criterion can be influenced by changing the proportion of signal trials (and informing the observer accordingly), by instructing the observer to be more relaxed or, conversely, more careful and accurate, or by changing the payoffs for the possible responses. If the percentage of hits is plotted on the y-axis and the percentage of false alarms on the x-axis, the resulting points correspond to different levels of the observer’s criterion, and the curve drawn through them is called the receiver operating characteristic (ROC) curve. Signals of different levels generate different curves, while all points on one curve reflect the same detection ability of the observer. It thus becomes possible to distinguish the effects of sensory and non-sensory factors.
Applications. In addition to answering questions of theoretical psychophysics, psychophysical methods are widely used for practical problems both inside psychology and beyond it. Information about normal visual and auditory thresholds (and, to a somewhat lesser extent, the thresholds of the other sense organs) is taken into account when designing equipment and analyzing human factors in engineering psychology, and is used in practical medicine as comparison standards for clinical diagnosis. Scaling methods for suprathreshold stimuli are used in industry and commerce to assess preferences. SDT methods also find the widest application, from assessing the limits of “pure” sensory sensitivity to decision making in medicine.
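The separation of sensitivity from the response criterion can be illustrated with the standard equal-variance SDT indices d' and c. The hit and false-alarm rates below are invented so that both hypothetical observers lie on the same ROC curve.

```python
from statistics import NormalDist

def dprime_and_criterion(hits, false_alarms):
    """Classical equal-variance SDT indices from a yes-no experiment:
    d' measures sensory sensitivity, c the response criterion."""
    z = NormalDist().inv_cdf
    d_prime = z(hits) - z(false_alarms)
    criterion = -(z(hits) + z(false_alarms)) / 2
    return d_prime, criterion

# Two observers with the same sensitivity but different criteria (both d' = 1.5):
liberal = dprime_and_criterion(hits=0.8413, false_alarms=0.3085)  # c < 0, says "yes" readily
strict = dprime_and_criterion(hits=0.6915, false_alarms=0.1587)   # c > 0, says "yes" cautiously
```

Both observers differ sharply in hit and false-alarm rates, yet their d' values coincide: exactly the sensory/non-sensory separation the ROC analysis provides.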

Psychophysical laws. Bouguer - Weber, Weber - Fechner, Stevens, generalized psychophysical law
Basic psychophysical law. Based on Weber’s law, Fechner assumed that barely noticeable differences in sensation can be considered equal, since they are all infinitesimal quantities, and can be taken as the unit of measure with which the intensity of sensation is expressed numerically as the sum (or integral) of barely noticeable (infinitesimal) increments, counted from the absolute sensitivity threshold. As a result he obtained two series of variable quantities: the magnitudes of the stimuli and the corresponding magnitudes of the sensations. Sensation grows in arithmetic progression when the stimulus grows in geometric progression. The ratio of these two variables can be expressed by a logarithmic formula:
E = K·log R + C,
where K and C are constants. This formula, which determines the dependence of the intensity of sensation (in units of barely noticeable changes) on the intensity of the corresponding stimulus, is the so-called psychophysical Weber-Fechner law.
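Fechner's construction can be checked numerically: counting barely noticeable increments, each equal to the Weber fraction of the current stimulus, from the threshold upward yields a count that grows logarithmically with the stimulus. The Weber fraction k = 0.01 below is an arbitrary illustrative value.

```python
import math

def count_jnds(R, R1=1.0, k=0.01):
    """Count barely noticeable steps from threshold R1 up to R,
    each step increasing the stimulus by the Weber fraction k."""
    steps, r = 0, R1
    while r < R:
        r *= 1 + k   # delta R / R = k (Bouguer-Weber law)
        steps += 1
    return steps

# The number of JNDs grows as ln(R/R1) / ln(1+k), i.e. logarithmically in R.
jnds_10 = count_jnds(10.0)
jnds_100 = count_jnds(100.0)
```

Squaring the stimulus (10 → 100) merely doubles the JND count, which is the "arithmetic progression of sensation against geometric progression of stimulus" stated above.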
The sensitivity threshold corresponds to a point in sensory space. This point reflects the value of the stimulus at which the sensory system transitions from one state to another. In the case of an absolute threshold, it moves from the absence of sensation to the appearance of a barely noticeable sensation, in the case of a difference threshold, from the absence of a sensation of difference to the appearance of a sensation of difference. Thus, threshold measurements are point measurements. Their results may delineate the boundaries (range of changes in stimulus magnitude) within which the sensory system operates, but they say nothing about its structure.
The three most famous psychophysical laws are theoretical models of the structure of sensory space, all based on the empirical Bouguer-Weber law. At the turn of the 18th and 19th centuries, the French physicist Bouguer discovered for the visual modality, and the German physiologist Weber later verified for other modalities, the following effect: the ratio of a barely noticeable increment of a stimulus to its initial magnitude remains constant over a very wide range of stimulus magnitudes, i.e.
ΔR/R = k.
This relationship is called the Bouguer-Weber law.
Fechner's law. In solving his problem of the relationship between the subjective and the objective, Fechner reasoned approximately as follows. Suppose that our sensory space consists of very small discrete elements e, the barely noticeable differences. These elements are equal to one another, i.e. constant:
e = k,
where k is a constant.
Up to a proportionality coefficient, the two constants can be equated to each other. Thus the constant ratio of the Bouguer-Weber law can be equated to the constant associated with the barely noticeable difference:
ΔR/R = K·e,
where K is the proportionality coefficient.
Next, Fechner took a decisive step: from this equation connecting the small quantities e and ΔR he passed to the differential equation
dR/R = K·dE,
where dE is the differential corresponding to the very small quantity e. The solution of this equation is the relation
E = C1·ln R + C2,
where C1 and C2 are integration constants.
Let us determine C2. Sensation begins at some stimulus value corresponding to the threshold R1. At R = R1 sensation is absent and appears only at the slightest excess of R over R1; i.e., in this case E = 0. Substituting into the solution obtained:
0 = C1·ln R1 + C2,
hence C2 = -C1·ln R1, and therefore
E = C1·ln R - C1·ln R1 = C1·ln(R/R1).
The relation E = C1·ln(R/R1) is called Fechner's law, or sometimes the Weber-Fechner law. Note that Fechner's law makes active use of the concept of threshold: R1 is obviously the absolute threshold, and the elementary sensation e is an analogue of the discrimination threshold.
Stevens' law. The American psychophysicist Stevens proposed his own solution to the problem. His starting point was also the Bouguer-Weber law, but he pictured the model of sensory space differently. Stevens suggested that a relation similar to the Bouguer-Weber law for the stimulus space operates in sensory space:
ΔE/E = k,
i.e., the ratio of a barely noticeable increment of sensation to its initial magnitude is a constant. Again, up to a proportionality coefficient, the two constants can be equated:
ΔE/E = n·ΔR/R.
Since Stevens did not postulate the discreteness of sensory space, he could quite correctly pass to the differential equation
dE/E = n·dR/R,
whose solution E = k·R^n is called Stevens' law. The exponent n has its own value for each modality but, as a rule, is less than one.
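Because the power law becomes a straight line in log-log coordinates (ln E = ln k + n·ln R), the exponent n can be recovered from magnitude estimates by a simple least-squares fit. The data below are simulated under an invented exponent, roughly in the range reported for brightness.

```python
import math, random

rng = random.Random(2)
true_n, k = 0.33, 2.0                 # illustrative exponent and scale constant
stimuli = [1, 2, 5, 10, 20, 50, 100]

# Magnitude estimates generated by E = k * R**n with multiplicative noise.
estimates = [k * R ** true_n * math.exp(rng.gauss(0, 0.05)) for R in stimuli]

# In log-log coordinates the power law is a straight line whose slope is n,
# so n is estimated by an ordinary least-squares fit.
xs = [math.log(R) for R in stimuli]
ys = [math.log(E) for E in estimates]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
n_hat = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
```

This log-log regression is in fact how power-law exponents are commonly extracted from magnitude-estimation data.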
The American scientists R. and B. Teghtsoonian proposed an interpretation of the meaning of the exponent n. Let us write the system of equations for the two extreme cases, the minimum and maximum sensations:
E_min = k·R_min^n,
E_max = k·R_max^n.
Taking logarithms of both equations, we get:
ln E_min = n·ln R_min + ln k,
ln E_max = n·ln R_max + ln k.
Solving the system for n, we obtain
n = (ln E_max - ln E_min) / (ln R_max - ln R_min),
or
n = ln(E_max/E_min) / ln(R_max/R_min).
Thus, according to the Teghtsoonians, the value of n for each modality determines the relationship between the range of sensations and the range of perceived stimuli.
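As a quick numeric illustration of this interpretation (the ranges below are hypothetical, chosen only to show the arithmetic):

```python
import math

def stevens_exponent_from_ranges(E_max, E_min, R_max, R_min):
    """Exponent n as the ratio of the (log) range of sensations
    to the (log) range of perceivable stimuli."""
    return math.log(E_max / E_min) / math.log(R_max / R_min)

# A 1.5-log-unit sensory range over a 4.5-log-unit stimulus range gives n = 1/3.
n = stevens_exponent_from_ranges(10 ** 1.5, 1.0, 10 ** 4.5, 1.0)
```

A wide stimulus range mapped onto a narrow sensory range yields a small exponent, i.e. strong compression.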
For more than a hundred years, disputes between supporters of the logarithmic dependence of the strength of sensation on the magnitude of the stimulus (Fechner's law) and the power law (Stevens' law) have not ceased. The results of experiments with some modalities are better approximated by a logarithm, while with others - by a power function.
Let's consider one approach that reconciles these two extremes.
Generalized psychophysical law. Yu.M. Zabrodin proposed his own explanation of the psychophysical relationship. The world of stimuli is again described by the Bouguer-Weber law, while Zabrodin proposed the following structure of sensory space:
ΔE/E^z = k,
i.e., he added the constant z. The generalized psychophysical law is accordingly written:
dE/E^z = dR/R.
Obviously, at z = 0 the formula of the generalized law transforms into Fechner's logarithmic law, and at z = 1 into Stevens' power law. The value of this constant determines the degree of the subject's awareness of the goals, objectives and course of the experiment. Fechner's experiments involved "naive" subjects who found themselves in a completely unfamiliar experimental situation and knew nothing about the upcoming experiment beyond the instructions. This requirement to work with "naive" subjects follows, first, from Fechner's postulate that a person cannot directly make quantitative estimates of the magnitude of sensation, and second, from his hope of isolating in the experiment the work of the sensory system in its "pure" form, excluding the influence of other mental systems. Thus, in Fechner's law z = 0, which corresponds to complete unawareness of the subjects.
Stevens solved more pragmatic problems. He was more interested in how a person perceives a sensory signal in real life, and not in abstract problems of the operation of the sensory system. He proved the possibility of direct estimates of the magnitude of sensations, the accuracy of which increases with proper training of the subjects. His experiments involved subjects who had undergone preliminary training and were trained to act in a situation of a psychophysical experiment. Therefore, in Stevens' law z = 1, which shows the complete awareness of the subject.
Zabrodin's generalized psychophysical law removes the contradiction between Stevens' and Fechner's laws, but to do so it has to go beyond the paradigms of classical psychophysics. It is obvious that the concepts of "awareness" and "unawareness" refer to the work of integral mental formations, which include the sensory system only as a channel for obtaining information about the external world.
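The claim that the generalized law contains both classical laws as special cases can be checked numerically by integrating dE/E^z = dR/R. This is a sketch: the step size, range and starting value are arbitrary, and in this normalized form (with no free coefficient) z = 1 yields a power law with exponent 1.

```python
import math

def integrate_zabrodin(z, R1=1.0, R=10.0, E0=1.0, steps=200_000):
    """Euler-integrate Zabrodin's equation dE/E**z = dR/R from R1 to R,
    starting from E0 > 0 (nonzero so that the z = 1 case is non-degenerate)."""
    E, r = E0, R1
    h = (R - R1) / steps
    for _ in range(steps):
        E += E ** z * h / r
        r += h
    return E

# z = 0 reproduces Fechner's logarithm: E = E0 + ln(R/R1).
fechner_like = integrate_zabrodin(z=0.0)
# z = 1 reproduces power-law (here linear, exponent 1) growth: E = E0 * R/R1.
stevens_like = integrate_zabrodin(z=1.0)
```

The two limiting cases of one differential equation thus recover, respectively, the logarithmic and the power-type dependence discussed above.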
Psychophysical laws establish connections between psychophysical correlates; the sensation is measured in physical quantities, i.e., in terms of the magnitude of the stimulus causing that sensation. For example, a loudness of one sone (the subjective value) corresponds to a 1000 Hz tone at an intensity of 40 dB (the objective value). Psychophysical laws show how the space of stimuli (external influences) is transformed into sensory space. Owing to the form of the transformation function (the psychophysical law), the range of stimulus values is "compressed".
But in real life, pairs of psychophysical correlates almost never occur in pure form. Even signals of a single modality represent a very complex set of physical characteristics whose resultant value is not additive with respect to its components. This is clearly seen in the example of sound timbre, whose physical correlate is the set of harmonics composing the sound signal, a characteristic that cannot be measured on a simple physical scale. Without a physical scale, measurements of mental quantities lose their basis and "hang in the air". What can be done in this case? Classical psychophysics, limited by its two main paradigms, could not answer this question.

Psychophysical scaling
Psychophysical methods for scaling suprathreshold stimuli form a collection of widely varying procedures that have in common only that they prescribe rules by which subjects (directly or indirectly) assign numerical scale values to physical stimuli. These methods have often been used to test particular psychophysical laws.
Among them are the methods of bisection, subjectively equal intervals, fractionation, and magnitude estimation. When using the bisection method, the subject is instructed to adjust the magnitude of the variable stimulus so that the resulting sensation seems to him equidistant from the sensations caused by two constant stimuli that set the boundaries of the interval to be divided in half. This procedure is repeated many times, after which the arithmetic mean of the values of the variable stimulus selected by the subject is calculated.
The method of subjectively equal intervals - a variation of the category method - provides the observer with the opportunity to classify the presented stimuli into one of the “equally wide” categories, the number of which (for example, 5) is set by the experimenter and does not change during the experiment. The most extreme stimuli are presented first and identified as such to serve as reference points for subsequent judgments. After the observer has classified all the stimuli, their subjective values, defined as the average or median categories, are presented graphically as a function of the objective value of the stimulus.
The fractionation method requires the observer to produce (by adjustment or selection) a new stimulus on each trial that constitutes a predetermined fraction (e.g., half) of the stimulus presented to him. This is done for each of the stimuli included in the stimulus set.
The magnitude estimation method is a widely used procedure that allows the observer to estimate the magnitude of stimuli by assigning numbers to them. Stimuli that are stronger compared to the reference stimulus usually receive larger numerical values, and weaker ones - smaller ones. For each stimulus, the arithmetic mean or geometric mean of the numerical ratings obtained on the group of subjects is calculated. The resulting averages of subjective assessments of the magnitude of the stimulus are presented graphically as a function of the actual magnitude of the stimulus.
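The averaging step can be sketched as follows (the estimates are hypothetical; the geometric mean is shown because magnitude estimates are commonly averaged this way):

```python
import math

def geometric_mean(estimates):
    """Geometric mean of positive magnitude estimates, computed via logs."""
    return math.exp(sum(math.log(x) for x in estimates) / len(estimates))

# hypothetical magnitude estimates from five observers for one stimulus,
# given relative to a reference stimulus that was assigned the value 10
estimates = [8.0, 10.0, 12.5, 9.0, 11.0]
subjective_magnitude = geometric_mean(estimates)
```

The resulting per-stimulus averages would then be plotted against the physical stimulus magnitudes, as described above.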
Signal detection theory. Motivation, expectation, and observer attitudes cause measurement results to be biased relative to the true value in psychophysical threshold experiments. Thus, when using the method of constant stimuli, "yes" answers still appear in empty trials ("catch trials"), when no stimulus is presented to the observer. In signal detection theory (SDT) this response is called a false alarm. Correct detection of the stimulus (answering "yes" when it is present) is called a hit. Changes in motivation, expectations, or attitudes may increase the hit rate, but at the cost of an increased false alarm rate.
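The hit and false-alarm rates described above are commonly combined into the SDT sensitivity index d'; a minimal sketch with hypothetical trial counts:

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate),
    which separates detection sensitivity from response bias."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# hypothetical counts: 80 hits, 20 misses, 10 false alarms, 90 correct rejections
sensitivity = d_prime(80, 20, 10, 90)
```

A shift in the observer's criterion moves both rates together and leaves d' roughly unchanged, which is exactly the point of the index.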

1. The history of experimental planning

Experimental design is a product of our time, but its origins are lost in the mists of time.

The origins of experimental planning go back to ancient times and are associated with numerical mysticism, prophecies and superstitions.

This is actually not planning a physical experiment, but planning a numerical experiment, i.e. arrangement of numbers so that certain strict conditions are met, for example, the equality of the sums along the rows, columns and diagonals of a square table, the cells of which are filled with numbers from the natural series.

Such conditions are fulfilled in magic squares, which, apparently, have primacy in the planning of the experiment.

According to one legend, around 2200 BC the Chinese Emperor Yu performed mystical calculations using a magic square depicted on the shell of a divine turtle.

Emperor Yu Square

The cells of this square are filled with numbers from 1 to 9, and the sum of the numbers in rows, columns and main diagonals is 15.
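The magic property is easy to check mechanically; the arrangement below is one traditional form of Emperor Yu's (Lo Shu) square:

```python
def is_magic(square):
    """Check that all rows, columns and both main diagonals share one sum."""
    n = len(square)
    sums = [sum(row) for row in square]
    sums += [sum(square[i][j] for i in range(n)) for j in range(n)]
    sums.append(sum(square[i][i] for i in range(n)))          # main diagonal
    sums.append(sum(square[i][n - 1 - i] for i in range(n)))  # anti-diagonal
    return len(set(sums)) == 1

# one traditional arrangement of the Lo Shu square
lo_shu = [[4, 9, 2],
          [3, 5, 7],
          [8, 1, 6]]
```

Every row, column and diagonal of this square sums to the magic constant 15.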

In 1514, the German artist Albrecht Dürer depicted a magic square in the upper right corner of his famous allegorical engraving "Melencolia I". The two middle numbers of the bottom row (15 and 14) give the year the engraving was created. This was a kind of "application" of the magic square.

Durer square

For several centuries, the construction of magic squares occupied the minds of Indian, Arab, German, and French mathematicians.

Currently, magic squares are used when planning an experiment under conditions of linear drift, when planning economic calculations and preparing food rations, in coding theory, etc.

The construction of magic squares is a problem of combinatorial analysis, the foundations of which, in its modern understanding, were laid by G. Leibniz. He not only examined and solved basic combinatorial problems, but also pointed out the great practical applications of combinatorial analysis: to encoding and decoding, to games and statistics, to the logic of invention and the logic of geometry, to the art of war, grammar, medicine, law, technology, and to combinations of observations. The last area of application is closest to experimental design.

One of the combinatorial problems, which is directly related to the planning of an experiment, was studied by the famous St. Petersburg mathematician L. Euler. In 1779, he proposed the problem of 36 officers as some kind of mathematical curiosity.

He asked whether it is possible to select 36 officers of 6 ranks from 6 regiments, one officer of each rank from each regiment, and arrange them in a square so that each row and each column contains one officer of each rank and one officer from each regiment. The problem is equivalent to constructing a pair of orthogonal 6x6 Latin squares. It turned out that this problem cannot be solved. Euler conjectured that no pair of orthogonal squares exists for any order n ≡ 2 (mod 4).

Many mathematicians subsequently studied Euler's problem, in particular, and Latin squares in general, but almost none of them thought about the practical application of Latin squares.

Currently, Latin squares are one of the most popular methods of restricting randomization in the presence of discrete sources of inhomogeneity in experimental design. Grouping by the elements of a Latin square, thanks to its properties (each element appears once and only once in each row and in each column of the square), protects the main effects from the influence of the source of inhomogeneity. Latin squares are also widely used as a means of reducing enumeration in combinatorial problems.
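The defining property of a Latin square is simple to generate and verify; a minimal sketch:

```python
def cyclic_latin_square(n):
    """Build an n x n Latin square by cyclic shifts of the symbols 0..n-1."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def is_latin(square):
    """Each symbol must appear exactly once in every row and every column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok
```

The cyclic construction is only one of many; any permutation of rows, columns, or symbols of a Latin square yields another Latin square.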

The emergence of modern statistical methods of experiment planning is associated with the name of R. Fisher.

In 1918, he began his famous series of works at the Rothamsted agrobiological station in England. In 1935, his monograph "The Design of Experiments" appeared, which gave its name to the entire field.

Among planning methods, the first was the analysis of variance (incidentally, Fisher also coined the term "variance"). Fisher created the basis of this method by describing complete ANOVA classifications (univariate and multivariate experiments) and partial ANOVA classifications without and with restrictions on randomization. At the same time, he made extensive use of Latin squares and randomized block designs. Together with F. Yates, he described their statistical properties. In 1942, A. Kishen considered planning using Latin cubes, a further development of the theory of Latin squares.

Then R. Fisher independently published information about orthogonal hyper-Graeco-Latin cubes and hypercubes. Soon afterwards (1946–1947), R. Rao examined their combinatorial properties. The works of H. Mann (1947–1950) are devoted to the further development of the theory of Latin squares.

R. Fisher's research, carried out in connection with his work in agrobiology, marks the first stage in the development of experimental design methods. Fisher developed the factorial planning method, and F. Yates proposed a simple computational scheme for it. Factorial planning became widespread. A feature of a full factorial experiment is the need to conduct a large number of runs at once.

In 1945, D. Finney introduced fractional replicas of the factorial experiment. This allowed a sharp reduction in the number of runs and opened the way for technical applications of experimental design. Another way of reducing the required number of runs was shown in 1946 by R. Plackett and J. Burman, who introduced saturated factorial designs.
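The idea of a fractional replica can be sketched for the textbook case of a half-replica of a 2^3 factorial with the defining relation I = ABC (an illustrative choice of defining relation):

```python
from itertools import product

# full 2^3 factorial in coded levels -1/+1: eight runs
full = list(product([-1, 1], repeat=3))

# half-replica defined by I = ABC: keep only runs where A*B*C = +1.
# This halves the number of runs at the cost of confounding each main
# effect with a two-factor interaction.
half = [run for run in full if run[0] * run[1] * run[2] == 1]
```

With three factors the saving is modest, but for a 2^7 design the same idea cuts 128 runs down to a small fraction, which is what made technical applications practical.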

In 1951, the work of the American scientists G. Box and K. Wilson began a new stage in the development of experimental planning.

This work summarized the previous ones. It clearly formulated, and brought to the level of practical recommendations, the idea of sequentially determining by experiment the optimal conditions for carrying out processes: estimating the coefficients of power-series expansions by the method of least squares, moving along the gradient, and finding an interpolation polynomial (power series) in the region of the extremum of the response function (the "almost stationary" region).
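The least-squares step of the Box–Wilson scheme can be sketched for a first-order model on a coded 2^2 factorial (the response values are hypothetical). With coded ±1 levels the design is orthogonal, so the least-squares coefficients of y = b0 + b1·x1 + b2·x2 reduce to simple averages:

```python
# 2^2 factorial in coded levels +/-1 with hypothetical response values y
runs = [(-1, -1, 52.0),
        ( 1, -1, 60.0),
        (-1,  1, 55.0),
        ( 1,  1, 64.0)]
n = len(runs)

# orthogonality of the coded design makes the least-squares estimates averages
b0 = sum(y for _, _, y in runs) / n          # grand mean
b1 = sum(x1 * y for x1, _, y in runs) / n    # effect of factor 1
b2 = sum(x2 * y for _, x2, y in runs) / n    # effect of factor 2

# the steepest-ascent (gradient) direction in coded units is (b1, b2)
```

The experimenter would then run trials along the direction (b1, b2) until the response stops improving, and fit a higher-order polynomial near the extremum.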

In 1954–1955, G. Box, and then G. Box and P. Youle, showed that experimental design can be used in the study of the physicochemical mechanisms of processes if one or more possible hypotheses are stated a priori. Here experimental design intersected with studies of chemical kinetics. It is interesting to note that kinetics can be regarded as a method of describing a process using differential equations, a tradition that goes back to I. Newton. A description of a process by differential equations, called deterministic, is often contrasted with statistical models.

G. Box and J. Hunter formulated the principle of rotatability for describing the "almost stationary" region, which is now developing into an important branch of the theory of experimental design. The same work showed the possibility of planning with partitioning into orthogonal blocks, indicated earlier and independently by de Baun.

A further development of this idea was planning, orthogonal to uncontrolled time drift, which should be considered as an important discovery in experimental technology - a significant increase in the capabilities of the experimenter.


2. Mathematical planning of experiments in scientific research

2.1 Basic concepts and definitions

By experiment we mean a set of operations performed on an object of study in order to obtain information about its properties. An experiment in which the researcher, at his discretion, can change the conditions of its conduct is called an active experiment. If the researcher cannot independently change the conditions of its conduct, but only registers them, then this is a passive experiment.

The most important task of methods for processing information obtained during an experiment is the task of constructing a mathematical model of the phenomenon, process, or object being studied. It can be used in process analysis and object design. It is possible to obtain a well-approximating mathematical model if an active experiment is purposefully used. Another task of processing the information obtained during the experiment is the optimization problem, i.e. finding such a combination of influencing independent variables that the selected optimality indicator takes an extreme value.

A run (trial) is a separate elementary part of an experiment.

Experimental plan – a set of data that determines the number, conditions and order of experiments.

Experimental planning is the selection of an experimental plan that meets specified requirements, a set of actions aimed at developing an experimentation strategy (from obtaining a priori information to obtaining a workable mathematical model or determining optimal conditions). This is purposeful control of an experiment, implemented under conditions of incomplete knowledge of the mechanism of the phenomenon being studied.

In the process of measurements, subsequent data processing, as well as formalization of the results in the form of a mathematical model, errors arise and some of the information contained in the original data is lost. The use of experimental planning methods makes it possible to determine the error of the mathematical model and judge its adequacy. If the accuracy of the model turns out to be insufficient, then the use of experimental planning methods makes it possible to modernize the mathematical model with additional experiments without losing previous information and with minimal costs.

The purpose of experimental planning is to find conditions and rules for conducting experiments under which it is possible to obtain reliable and accurate information about an object with the least expenditure of labor, and to present this information in a compact and convenient form with a quantitative assessment of its accuracy.

1 Designs for one independent variable

The design of a “true” experimental study differs from others in the following important ways:

1) using one of the strategies for creating equivalent groups, most often randomization;

2) the presence of an experimental and at least one control group;

3) completion of the experiment by testing and comparing the behavior of the group that received the experimental intervention (X1) with that of the group that did not (X0).

The classic version of the plan is the plan for 2 independent groups. In psychology, experimental planning began to be used in the first decades of the 20th century.

There are three main versions of this plan. When describing them, we will use the symbolization proposed by Campbell.

Table 5.1

Here R is randomization, X is exposure, O1 is testing the first group, O2 is testing the second group.

1) Two-randomized-group design with post-exposure testing. Its author is the famous biologist and statistician R. A. Fisher. The structure of the plan is shown in Table 5.1.

Equivalence of the experimental and control groups is an absolutely necessary condition for applying this design. Most often, the randomization procedure is used to achieve group equivalence (see Chapter 4). This design is recommended when preliminary testing of subjects is impossible or unnecessary. If randomization is done well, this design is the best, allowing most sources of artifacts to be controlled; in addition, various variants of the analysis of variance are applicable to it.
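Randomization itself is mechanically simple; a minimal sketch of random assignment of subjects to groups (the subject list and seed are illustrative):

```python
import random

def randomize_groups(subjects, n_groups=2, seed=None):
    """Randomly assign subjects to n_groups groups of (nearly) equal size.

    A fixed seed makes the assignment reproducible, which is useful for
    documenting the randomization procedure in a study report."""
    rng = random.Random(seed)
    shuffled = subjects[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

# hypothetical pool of 20 subjects split into experimental and control groups
groups = randomize_groups(list(range(20)), n_groups=2, seed=42)
```

Randomization equalizes groups only in expectation; with small samples a pretest design (discussed below in the source text) remains advisable.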

After randomization or another procedure for equalizing groups, an experimental intervention is carried out. In the simplest version, only two gradations of the independent variable are used: there is an impact, there is no impact.

If it is necessary to use more than one level of exposure, then designs with several experimental groups (one per exposure level) and one control group are used.

If it is necessary to control the influence of one additional variable, a design with two control groups and one experimental group is used. Measuring behavior provides the material for comparing the groups, and data processing reduces to traditional estimates of mathematical statistics. Consider the case when the measurement is carried out on an interval scale. Student's t-test is used to assess differences in group means, and differences in the variation of the measured parameter between the experimental and control groups are assessed with the F test. The corresponding procedures are discussed in detail in textbooks on mathematical statistics for psychologists.
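The comparison of group means can be sketched as a pooled-variance Student's t (the scores are hypothetical; the F test for variances would be the ratio of the two sample variances):

```python
from statistics import mean, variance

def independent_t(a, b):
    """Student's t for two independent samples, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5

# hypothetical post-test scores: experimental group vs. control group
t = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```

The resulting t would be compared against the critical value for na + nb − 2 degrees of freedom, as in the statistics textbooks the text refers to.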


Using a two-randomized-group design with post-exposure testing allows the main threats to internal validity (as defined by Campbell) to be controlled. Since there is no preliminary testing, both the interaction of the testing procedure with the experimental treatment and the testing effect itself are excluded. The design controls the influence of group composition, spontaneous dropout, background and natural development, and the interaction of group composition with other factors; it also eliminates the regression effect through randomization and the comparison of data from the experimental and control groups. However, in most pedagogical and social-psychological experiments it is necessary to strictly control the initial level of the dependent variable, be it intelligence, anxiety, knowledge, or an individual's status in a group. Randomization is the best procedure available, but it does not give an absolute guarantee of the correct choice. When there is doubt about the results of randomization, a pretest design is used.

Table 5.2

2) Design for two randomized groups with pretest and posttest. Let's consider the structure of this plan (Table 5.2).

The pretest design is popular among psychologists. Biologists have more confidence in the randomization procedure. The psychologist knows very well that each person is unique and different from others, and subconsciously strives to capture these differences with the help of tests, not trusting the mechanical randomization procedure. However, the hypothesis of most psychological research, especially in the field of developmental psychology (“formative experiment”), contains a prediction of a certain change in an individual’s property under the influence of an external factor. Therefore, test-exposure-retest designs using randomization and a control group are very common.

In the absence of a group matching procedure, this design becomes a quasi-experimental design (discussed in Section 5.2).

The main source of artifacts that undermines the external validity of a procedure is the interaction of testing with experimental effects. For example, testing the level of knowledge on a certain subject before conducting an experiment on memorizing material can lead to the updating of initial knowledge and to a general increase in memorization productivity. This is achieved by updating mnemonic abilities and creating a memorization mindset.

However, with the help of this plan, other external variables can be controlled. The factor of “history” (“background”) is controlled, since in the interval between the first and second testing both groups are exposed to the same (“background”) influences. However, Campbell notes the need to control for “within-group events,” as well as the effect of non-simultaneous testing in both groups. In reality, it is impossible to ensure that the test and retest are carried out simultaneously in them. The design becomes quasi-experimental, for example:

Typically, non-simultaneous testing is controlled by two experimenters testing two groups simultaneously. The optimal procedure is to randomize the order of testing: testing members of the experimental and control groups is carried out in random order. The same is done with the presentation or non-presentation of experimental influence. Of course, such a procedure requires a significant number of subjects in the experimental and control samples (at least 30-35 people in each).

Natural history and testing effects are controlled by ensuring that they occur equally in the experimental and control groups, and group composition and regression effects [Campbell, 1980] are controlled by the randomization procedure.

The results of applying the test-exposure-retest plan are presented in the table.

When processing the data, the parametric t and F tests (for interval-scale data) are usually used. Three t values are calculated, comparing 1) O1 and O2; 2) O3 and O4; 3) O2 and O4. The hypothesis of a significant influence of the independent variable on the dependent variable can be accepted if two conditions are met: a) the differences between O1 and O2 are significant while those between O3 and O4 are not, and b) the differences between O2 and O4 are significant. It is often more convenient to compare not the absolute values but the gains from the first test to the second: δ12 = O2 − O1 and δ34 = O4 − O3 are calculated and compared using Student's t-test. If the differences are significant, the experimental hypothesis about the influence of the independent variable on the dependent variable is accepted (Table 5.3).
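The gain-score comparison can be sketched as follows (the scores are hypothetical; the two gain lists would then be compared with Student's t-test, as described above):

```python
def gains(pre, post):
    """Per-subject gains: delta_i = post_i - pre_i."""
    return [b - a for a, b in zip(pre, post)]

# hypothetical scores: experimental group (O1 -> O2), control group (O3 -> O4)
delta_exp = gains([10, 12, 11, 13], [15, 18, 16, 19])
delta_ctrl = gains([11, 12, 10, 13], [12, 12, 11, 13])
```

Working with gains rather than raw post-test scores removes each subject's baseline level from the comparison.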

It is also recommended to use Fisher analysis of covariance. In this case, the pre-test indicators are taken as an additional variable, and the subjects are divided into subgroups depending on the pre-test indicators. This results in the following table for data processing using the MANOVA method (Table 5.4).

The use of a “test-exposure-retest” design allows you to control the influence of “side” variables that violate the internal validity of the experiment.

External validity refers to the transferability of data to a real-life situation. The main feature distinguishing the experimental situation from the real one is the introduction of preliminary testing. As already noted, the "test-exposure-retest" design does not allow us to control the interaction effect of testing and experimental treatment: a previously tested subject becomes "sensitized", that is, more sensitive to the treatment, since in the experiment we measure precisely the dependent variable that we intend to influence by varying the independent variable.

Table 5.5

To control external validity, R.L. Solomon’s plan, which he proposed in 1949, is used.

3) The Solomon plan is used when conducting an experiment on four groups:

1. Experiment 1: R O1 X O2

2. Control 1: R O3 O4

3. Experiment 2: R X O5

4. Control 2: R O6

The design includes a study of two experimental and two control groups and is essentially multi-group (2 x 2 type), but for ease of presentation it is discussed in this section.

Solomon's design is a combination of two previously discussed designs: the first, when no pretesting is carried out, and the second, test-exposure-retest. By using the "first part" of the design, the interaction effect of the first test and the experimental treatment can be controlled. Solomon, using his plan, reveals the effect of experimental exposure in four different ways: by comparing 1) O2 - O1; 2) O2 - O4; 3) O5 - O6 and 4) O5 - O3.

If we compare O6 with O1 and O3, we can identify the joint influence of the effects of natural development and “history” (background influences) on the dependent variable.

Campbell, criticizing the data processing schemes proposed by Solomon, suggests not paying attention to preliminary testing and reducing the data to a 2 x 2 scheme, suitable for applying variance analysis (Table 5.5).

Comparison of column averages makes it possible to identify the effect of experimental influence - the influence of the independent variable on the dependent one. Row means show the pretest effect. Comparison of cell means characterizes the interaction of the testing effect and the experimental effect, which indicates the extent of the violation of external validity.
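Campbell's 2 x 2 reduction can be sketched with hypothetical cell means (in practice each cell would hold a group mean over many subjects):

```python
# cell means in Campbell's 2x2 reduction of the Solomon design (hypothetical):
# rows = pretested / not pretested, columns = treatment X / no treatment
o2, o4 = 18.0, 12.0   # pretested group:     with X, without X
o5, o6 = 17.0, 12.5   # not pretested group: with X, without X

treatment_effect = (o2 + o5) / 2 - (o4 + o6) / 2   # column-mean difference
pretest_effect = (o2 + o4) / 2 - (o5 + o6) / 2     # row-mean difference
interaction = (o2 - o4) - (o5 - o6)                # testing x treatment
```

A nonzero interaction term signals that pretesting changed the treatment's effect, i.e. the extent of the violation of external validity.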

In the case where the effects of preliminary testing and interaction can be neglected, proceed to the comparison of O4 and O2 using the method of covariance analysis. As an additional variable, data from preliminary testing is taken according to the scheme given for the “test-exposure-retest” plan.

Finally, in some cases it is necessary to check the persistence of the effect of the independent variable on the dependent variable over time: for example, to find out whether a new teaching method leads to long-term memorization of the material. For these purposes, the following plan is used:

1 Experiment 1 R O1 X O2

2 Control 1 R O3 O4

3 Experiment 2 R O5 X O6

4 Control 2 R O7 O8

2. Designs for one independent variable and several groups

Sometimes comparing two groups is not enough to confirm or refute an experimental hypothesis. This problem arises in two cases: a) when it is necessary to control external variables; b) if it is necessary to identify quantitative relationships between two variables.

To control external variables, various versions of the factorial experimental design are used. As for identifying a quantitative relationship between two variables, the need to establish it arises when testing an “exact” experimental hypothesis. In an experiment involving two groups, at best, it is possible to establish the fact of a causal relationship between the independent and dependent variables. But between two points you can draw an infinite number of curves. To ensure that there is a linear relationship between two variables, you must have at least three points corresponding to the three levels of the independent variable. Therefore, the experimenter must select several randomized groups and place them in different experimental conditions. The simplest option is a design for three groups and three levels of the independent variable:

Experiment 1: R X1 O1

Experiment 2: R X2 O2

Control: R O3

The control group in this case is the third experimental group, for which the level of the variable X = 0.

In this design, each group is presented with only one level of the independent variable. It is also possible to increase the number of experimental groups according to the number of levels of the independent variable. To process the data obtained using such a plan, the same statistical methods are used as listed above.

Simple experimental designs of this kind are, surprisingly, very rarely used in modern experimental research. Perhaps researchers are "embarrassed" to put forward simple hypotheses, remembering the "complexity and multidimensionality" of mental reality. Yet the tendency to use designs with many independent variables, and indeed to conduct multivariate experiments, does not necessarily contribute to a better explanation of the causes of human behavior. As the saying goes, "a smart person amazes with the depth of his idea, and a fool with the scope of his construction." It is better to prefer a simple explanation to a complex one, even though regression equations in which everything equals everything and intricate correlation graphs may impress some dissertation committees.

3 Factorial designs

Factorial experiments are used when it is necessary to test complex hypotheses about the relationships between variables. The general form of such a hypothesis is: “If A1, A2,..., An, then B.” Such hypotheses are called complex, combined, etc. In this case, there can be various relationships between independent variables: conjunction, disjunction, linear independence, additive or multiplicative, etc. Factorial experiments are a special case of multivariate research, during which they try to establish relationships between several independent and several dependent variables. In a factorial experiment, as a rule, two types of hypotheses are tested simultaneously:

1) hypotheses about the separate influence of each of the independent variables;

2) hypotheses about the interaction of variables, namely how the level of one independent variable modifies the effect of another.

A factorial experiment is based on a factorial design. Factorial design of an experiment involves combining all levels of independent variables with each other. The number of experimental groups is equal to the number of combinations of levels of all independent variables.
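Enumerating the experimental conditions of a factorial design is a direct Cartesian product of the level sets; a sketch for a hypothetical 3x2 design:

```python
from itertools import product

def factorial_conditions(*factor_levels):
    """All combinations of the levels of the given independent variables."""
    return list(product(*factor_levels))

# hypothetical 3x2 design: task difficulty x presence of an observer
conditions = factorial_conditions(["easy", "medium", "hard"],
                                  ["observer", "no observer"])
```

The number of conditions (and, in a between-subjects design, of groups) is the product of the numbers of levels, here 3 x 2 = 6.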

Today, factorial designs are the most common in psychology, since simple relationships between two variables practically do not occur in it.

There are many variants of factorial designs, but not all are used in practice. Most common are designs for two independent variables with two levels each (2x2). The principle of balancing is applied in drawing up the plan. A 2x2 design is used to identify the effect of two independent variables on one dependent variable. The experimenter manipulates the possible combinations of variables and levels. The data are shown in a simple table (Table 5.6).

Less commonly used are four independent randomized groups. To process the results, Fisher's analysis of variance is used.

Other versions of the factorial design, namely 3x2 or 3x3, are also rarely used. The 3x2 design is used in cases where it is necessary to establish the type of dependence of one dependent variable on one independent variable, and one of the independent variables is represented by a dichotomous parameter. An example of such a plan is an experiment to identify the impact of external observation on the success of solving intellectual problems. The first independent variable varies simply: there is an observer, there is no observer. The second independent variable is task difficulty levels. In this case, we get a 3x2 plan (Table 5.7).

The 3x3 design option is used if both independent variables have several levels and it is possible to identify the types of relationships between the dependent variable and the independent ones. This plan makes it possible to identify the influence of reinforcement on the success of completing tasks of varying difficulty (Table 5.8).

In general, the design for two independent variables looks like N x M. The applicability of such plans is limited only by the need to recruit a large number of randomized groups. The amount of experimental work increases excessively with the addition of each level of any independent variable.

Designs used to examine the effects of more than two independent variables are rarely used. For three variables they have the general form L x M x N.

Most often, 2x2x2 designs are used: "three independent variables, two levels each." Obviously, adding each new variable increases the number of groups: their total number is 2^n, where n is the number of variables, in the case of two intensity levels, and K^n in the case of K levels (assuming the number of levels is the same for all independent variables). An example of such a plan could be a development of the previous one. If we are interested in the success of completing an experimental series of tasks that depends not only on general stimulation, carried out in the form of punishment (electric shock), but also on the ratio of reward and punishment, we use a 3x3x3 design.

A simplification of the complete three-variable design of the form L x M x N is planning by the "Latin square" method. The "Latin square" is used when it is necessary to study the simultaneous influence of three variables that have two or more levels. The Latin square principle is that each pairwise combination of levels of different variables occurs only once in the design. This greatly simplifies the procedure, not to mention sparing the experimenter the need to work with huge samples.

Suppose we have three independent variables, with three levels each:

The plan drawn up by the "Latin square" method is presented in Table 5.9.

The same technique is used to control external variables (counterbalancing). It is easy to notice that the levels of the third variable N (A, B, C) occur once in each row and in each column. By combining results across rows, columns, and levels, it is possible to identify the influence of each of the independent variables on the dependent variable, as well as the degree of pairwise interaction between the variables.

"Latin Square" allows you to significantly reduce the number of groups. In particular, the 2x2x2 plan turns into a simple table (Table 5.10).

The use of Latin letters in the cells to indicate the levels of the third variable (A - yes, B - no) is traditional, which is why the method is called the "Latin square".

A more complex plan using the "Graeco-Latin square" method is used very rarely. It can be used to study the influence of four independent variables on a dependent variable. Its essence is as follows: to each Latin letter of a three-variable plan, a Greek letter is appended, indicating the level of the fourth variable.

Let's look at an example. We have four variables, each with three intensity levels. The plan using the “Greco-Latin square” method will take the following form (Table 5.11).
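The construction can be checked mechanically: superimposing two orthogonal Latin squares must produce every ordered pair of levels exactly once. A sketch for the 3x3 case (the squares are one illustrative orthogonal pair):

```python
# two 3x3 Latin squares: levels of the third variable (latin) and
# of the fourth variable (greek)
latin = [[0, 1, 2],
         [1, 2, 0],
         [2, 0, 1]]
greek = [[0, 1, 2],
         [2, 0, 1],
         [1, 2, 0]]

# superimpose the squares cell by cell
pairs = {(latin[i][j], greek[i][j]) for i in range(3) for j in range(3)}
orthogonal = len(pairs) == 9   # all nine (latin, greek) pairs occur exactly once
```

When the pair count equals n^2, the squares are orthogonal and their superposition is a valid Graeco-Latin square; this is exactly the property that fails for n = 6 in Euler's officer problem.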

The Fisher analysis of variance method is used to process the data. The methods of the “Latin” and “Greco-Latin” square came to psychology from agrobiology, but were not widely used. The exception is some experiments in psychophysics and the psychology of perception.

The main problem that can be solved in a factorial experiment and cannot be solved using several ordinary experiments with one independent variable is determining the interaction of two variables.

Let's consider the possible results of the simplest 2x2 factorial experiment from the standpoint of variable interactions. To do this, we present the results on a graph, where the values of the first independent variable are plotted along the abscissa and the values of the dependent variable along the ordinate. Each of the two straight lines connecting the values of the dependent variable at different values of the first independent variable (A) characterizes one of the levels of the second independent variable (B). For simplicity, let us use the results of a correlational study rather than an experimental one. Let us agree that we examined the dependence of a child's status in a group on his state of health and level of intelligence. Let's consider the possible relationships between the variables.

First option: the lines are parallel - there is no interaction of variables (Fig. 5.1).

Sick children have a lower status than healthy children, regardless of their level of intelligence. Intellectuals always have a higher status (regardless of health).

The second option: physical health with a high level of intelligence increases the chance of receiving a higher status in the group (Figure 5.2).

In this case, the effect of divergent interaction between two independent variables is obtained. The second variable enhances the influence of the first on the dependent variable.

Third option: convergent interaction - physical health reduces the chance of an intellectual to acquire a higher status in the group. The “health” variable reduces the influence of the “intelligence” variable on the dependent variable. There are other cases of this interaction option:

the variables interact in such a way that an increase in the value of the first leads to a decrease in the influence of the second with a change in the sign of the dependence (Fig. 5.3).

Sick children with a high level of intelligence are less likely to receive a high status than sick children with low intelligence, while healthy children have a positive relationship between intelligence and status.

It is theoretically possible to imagine that sick children would have a greater chance of achieving high status with a high level of intelligence than their healthy, low-intelligence peers.

The last, fourth, possible variant of the relationships between independent variables observed in research: the case when there is an overlapping interaction between them, presented in the last graph (Fig. 5.4).

So, the following interactions of variables are possible: zero; divergent (with different signs of dependence); intersecting.
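The three interaction patterns above can be distinguished numerically by the interaction contrast of a 2x2 table of cell means: zero means parallel lines, a nonzero value means the effect of one variable changes across levels of the other. A minimal sketch (names and sample means are invented for illustration):

```python
def interaction(cell_means):
    """Interaction contrast for a 2x2 design:
    (A2B2 - A1B2) - (A2B1 - A1B1).
    Zero -> parallel lines (no interaction); nonzero -> the effect of A
    differs across levels of B (divergent, convergent, or crossing)."""
    (a1b1, a2b1), (a1b2, a2b2) = cell_means
    return (a2b2 - a1b2) - (a2b1 - a1b1)

# parallel lines: health shifts status equally at both intelligence levels
assert interaction([(2, 4), (3, 5)]) == 0
# divergent interaction: health amplifies the effect of intelligence
assert interaction([(2, 3), (3, 6)]) > 0
```

In practice the same contrast underlies the interaction term tested by analysis of variance; this snippet only shows the sign logic, not the significance test.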

The magnitude of the interaction is assessed using analysis of variance, and Student's t-test is used to assess the significance of differences between group means.

In all considered experimental design options, a balancing method is used: different groups of subjects are placed in different experimental conditions. The procedure for equalizing the composition of groups allows for comparison of results.

However, in many cases it is necessary to design an experiment so that all participants receive all possible exposures to independent variables. Then the counterbalancing technique comes to the rescue.

McCall calls plans that implement the "all subjects, all treatments" strategy "rotation experiments," and Campbell calls them "balanced plans." To avoid confusion between the concepts of "balancing" and "counterbalancing," we will use the term "rotation plan."

Rotation plans are constructed using the “Latin square” method, but, unlike the example discussed above, the rows indicate groups of subjects, not the levels of the variable, the columns indicate the levels of influence of the first independent variable (or variables), and the cells of the table indicate the levels of influence of the second independent variable.

An example of an experimental design for 3 groups (A, B, C) and 2 independent variables (X,Y) with 3 intensity levels (1st, 2nd, 3rd) is given below. It is easy to see that this plan can be rewritten so that the cells contain the levels of the Y variable (Table 5.12).

Campbell includes this design as a quasi-experimental design on the basis that it is unknown whether it controls for external validity. Indeed, it is unlikely that in real life a subject can receive a series of such influences as in the experiment.

As for the interaction of group composition with other external variables, sources of artifacts, randomization of groups, according to Campbell, should minimize the influence of this factor.

Column sums in a rotation design indicate differences in the effect size for different values ​​of one independent variable (X or Y), and row sums should indicate differences between groups. If the groups are randomized successfully, then there should be no differences between groups. If the composition of the group is an additional variable, it becomes possible to control it. The counterbalancing scheme does not avoid the training effect, although data from numerous experiments using the Latin square do not allow such a conclusion.

Summarizing the consideration of various options for experimental plans, we propose their classification. Experimental designs differ on the following grounds:

1. Number of independent variables: one or more. Depending on their number, either a simple or factorial design is used.

2. The number of levels of independent variables: with 2 levels we are talking about establishing a qualitative connection, with 3 or more - a quantitative connection.

3. Who gets the impact. If the scheme “each group has its own combination” is used, then we are talking about an intergroup plan. If the “all groups - all influences” scheme is used, then we are talking about a rotation plan. Gottsdanker calls it cross-individual comparison.

The design of an experiment can be homogeneous or heterogeneous (depending on whether the number of independent variables is equal or not equal to the number of levels of their change).

Experimental plans for one subject

Experiments on samples with control of variables have been widely used in psychology since the 1910s-1920s. Experimental studies on equalized groups became especially widespread after the outstanding biologist and mathematician R. A. Fisher created the theory of experimental design and of processing experimental results (analysis of variance and covariance). But psychologists used experimentation long before the theory of sample-based design appeared. The first experimental studies were carried out with the participation of one subject: the experimenter himself or his assistant. Beginning with G. Fechner (1860), experimental technique entered psychology to test theoretical quantitative hypotheses.

The classic experimental study of one subject was the work of H. Ebbinghaus (1885). Ebbinghaus investigated the phenomenon of forgetting by learning nonsense syllables (which he himself invented). He learned a series of syllables and then tried to reproduce them after a certain time. As a result, the classic forgetting curve was obtained: the dependence of the amount of retained material on the time elapsed since memorization (Fig. 5.5).

In empirical scientific psychology, three research paradigms interact and struggle. Representatives of one of them, traditionally coming from natural science experiments, consider the only reliable knowledge to be that obtained in experiments on equivalent and representative samples. The main argument of supporters of this position is the need to control external variables and level out individual differences in order to find general patterns.

Representatives of the methodology of “experimental analysis of behavior” criticize supporters of statistical analysis and design of experiments on samples. In their opinion, it is necessary to conduct studies with the participation of one subject and using certain strategies that will allow the sources of artifacts to be reduced during the experiment. Proponents of this methodology are such famous researchers as B.F. Skinner, G.A. Murray and others.

Finally, classical idiographic research is contrasted both with single-subject experiments and with designs that study behavior in representative samples. Idiographic research involves the study of individual cases: biographies or behavioral characteristics of individual people. Examples are Luria's remarkable works The Man with a Shattered World and The Mind of a Mnemonist (A Little Book about a Vast Memory).

In many cases, single-subject studies are the only option. The methodology of single-subject research was developed in the 1970s and 1980s by many authors: A. Kazdin, T. Kratochwill, B. F. Skinner, F. J. McGuigan, and others.

During the experiment, two sources of artifacts are identified: a) errors in the planning strategy and in the conduct of the study; b) individual differences.

If you create the “correct” strategy for conducting an experiment with one subject, then the whole problem will come down to only taking into account individual differences. An experiment with one subject is possible when: a) individual differences can be neglected in relation to the variables studied in the experiment, all subjects are considered equivalent, so data can be transferred to each member of the population; b) the subject is unique, and the problem of direct data transfer is irrelevant.

The single-subject experimentation strategy was developed by Skinner to study learning. Data during the study are presented in the form of "learning curves" in the coordinate system "time" vs. "total number of responses" (the cumulative curve). The learning curve is first analyzed visually, considering its changes over time. If the function describing the curve changes when the exposure changes from A to B, this may indicate a causal dependence of behavior on the external influences (A or B).
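A Skinner-style cumulative record, and the kind of slope comparison a visual analysis performs, can be sketched as follows (illustrative only; the data and names are invented):

```python
def cumulative_curve(responses):
    """Cumulative record: running total of responses over time."""
    total, curve = 0, []
    for r in responses:
        total += r
        curve.append(total)
    return curve

def mean_rate(responses):
    """Average response rate within one phase."""
    return sum(responses) / len(responses)

# responses per interval under condition A, then under condition B
a_phase = [1, 1, 2, 1]
b_phase = [4, 5, 4, 5]
curve = cumulative_curve(a_phase + b_phase)
assert curve[-1] == sum(a_phase + b_phase)
# a jump in the slope after the change from A to B is the visual cue
# that the exposure may be causally related to the behavior
assert mean_rate(b_phase) > mean_rate(a_phase)
```

A steeper cumulative curve in phase B corresponds to the change of function the text describes.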

Single-subject research is also called time series design. The main indicator of the influence of the independent variable on the dependent variable when implementing such a plan is the change in the nature of the subject’s responses due to the impact on him of changes in experimental conditions over time. There are a number of basic schemes for applying this paradigm. The simplest strategy is the A-B scheme. The subject initially performs the activity in conditions A, and then in conditions B (see Fig. 5.8).

When using this plan, a natural question arises: would the response curve have retained its previous form if there had been no impact? Simply put, this design does not control for the placebo effect. In addition, it is unclear what led to the effect: perhaps it was not variable B that had the effect, but some other variable not taken into account in the experiment.

Therefore, another scheme is used more often: A-B-A. Initially, the subject's behavior is recorded under conditions A, then the conditions change (B), and at the third stage the previous conditions return (A). The change in the functional relationship between the independent and dependent variables is studied. If, when conditions change at the third stage, the previous type of functional relationship between the independent and dependent variables is restored, then the independent variable is considered a cause that can modify the subject's behavior (Fig. 5.9).

However, both the first and second options for planning time series do not allow taking into account the factor of cumulation of impacts. Perhaps a combination—a sequence of conditions (A and B)—leads to the effect. It is also not obvious that after returning to situation B the curve will take the same form as it was when conditions B were first presented.

An example of a design that reproduces the same experimental effect twice is the A-B-A-B design. If, during the 2nd transition from conditions A to conditions B, a change in the functional dependence of the subject’s responses on time is reproduced, then this will become evidence of the experimental hypothesis: the independent variable (A, B) influences the subject’s behavior.
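The A-B-A-B logic — the effect appears under B, disappears under A, and is reproduced when B returns — amounts to comparing phase means along a time series. A minimal sketch with invented data:

```python
def split_phases(series, boundaries):
    """Split a response series into consecutive phases at the given indices."""
    phases, start = [], 0
    for b in boundaries + [len(series)]:
        phases.append(series[start:b])
        start = b
    return phases

def mean(xs):
    return sum(xs) / len(xs)

# hypothetical A-B-A-B data: three observations per phase
series = [2, 2, 3, 6, 7, 6, 2, 3, 2, 7, 6, 7]
a1, b1, a2, b2 = split_phases(series, [3, 6, 9])
# the effect appears under B, disappears under A, and is reproduced in B
assert mean(b1) > mean(a1)
assert mean(a2) < mean(b1)
assert mean(b2) > mean(a2)
```

The second reproduction of the effect (a2 -> b2) is what distinguishes this design from the weaker A-B and A-B-A schemes.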

Let's consider the simplest case. We will choose the student's total knowledge as the dependent variable, and morning physical exercise (for example, wushu gymnastics) as the independent variable. Let us assume that the wushu complex has a beneficial effect on the student's general mental state and promotes better memorization (Fig. 5.10).

It is obvious that gymnastics had a beneficial effect on learning ability.

There are various options for planning using the time-series method: schemes of regular alternation of series (AB-AB), stochastic sequences, and positional adjustment schemes (example: ABBA). Modifications of the A-B-A-B scheme are the A-B-A-B-A scheme or a longer one: A-B-A-B-A-B-A.
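The three families of exposure sequences just listed can be generated mechanically. A sketch (function names are invented; the stochastic variant is seeded only to make the example reproducible):

```python
import random

def regular(n_pairs):
    """Regular alternation: A-B-A-B-..."""
    return ["A", "B"] * n_pairs

def positional(n_blocks):
    """Positionally equalized scheme: ABBA blocks."""
    return ["A", "B", "B", "A"] * n_blocks

def stochastic(n_trials, seed=0):
    """Random sequence of exposures."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]

assert regular(2) == ["A", "B", "A", "B"]
assert positional(1) == ["A", "B", "B", "A"]
assert len(stochastic(8)) == 8
```

Which generator is appropriate depends on the number of trials available and on how strong the transfer effects are, as discussed below.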

The use of longer time series increases the likelihood of detecting an effect, but leads to subject fatigue and other cumulative effects.

In addition, the A-B-A-B plan and its various modifications do not solve three major problems:

1. What would happen to the subject if there was no effect (placebo effect)?

2. Isn't the sequence of influences A-B itself another influence (collateral variable)?

3. What specific cause produced the effect: if a different influence were applied in place of B, would the effect be repeated?

To control for the placebo effect, conditions that "simulate" either exposure A or exposure B are included in the A-B-A-B series. Let us consider a solution to the last problem by analyzing this case: suppose a student constantly practices wushu, but from time to time a pretty girl (just a spectator) appears at the stadium or in the gym (impact B). The A-B-A-B plan revealed an increase in the effectiveness of the student's studies during the periods when variable B appeared. What is the reason: the presence of a spectator as such, or of this specific girl? To test the hypothesis of a specific cause, the experiment is structured according to the scheme A-B-A-C-A: for example, in the fourth period another girl, or a bored pensioner, comes to the stadium. If the effectiveness of the classes decreases significantly (the motivation is not the same), this will indicate a specific cause of the change in learning. It is also possible to test the impact of condition A (wushu classes without spectators) using the A-B-C-B plan: let the student stop the classes for some time while the girl is absent. If her repeated appearance at the stadium leads to the same effect as the first time, then the reason for the increase in performance lies in her, and not merely in the wushu classes (Fig. 5.11).

Please do not take this example seriously. In reality, just the opposite happens: infatuation with girls sharply reduces student performance.

There are many techniques for conducting single-subject studies. An example of the development of an A-B plan is the “alternative impact plan.” Exposures A and B are randomly distributed over time, for example by day of the week, if we are talking about different methods of quitting smoking. Then all the moments when there was impact A are determined; a curve is constructed connecting the corresponding successive points. All moments in time when there was an “alternative” influence B are identified and, in order of sequence in time, also connected; the second curve is constructed. Then both curves are compared and it is determined which effect is more effective. Efficiency is determined by the magnitude of the rise or fall of the curve (Fig. 5.12).
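The bookkeeping of an alternative-impact plan — separating the interleaved A and B days into two curves and comparing them — can be sketched directly (the schedule and scores are invented illustrative data):

```python
# days on which each quitting method was applied, with an outcome score per day
schedule = ["A", "B", "A", "A", "B", "B", "A", "B"]
scores   = [ 3,   5,   2,   3,   6,   5,   3,   6 ]

def curve_for(treatment):
    """Successive outcome points for one treatment, in time order."""
    return [s for t, s in zip(schedule, scores) if t == treatment]

a_curve, b_curve = curve_for("A"), curve_for("B")
# compare the two curves: here treatment B produces the larger effect
assert sum(b_curve) / len(b_curve) > sum(a_curve) / len(a_curve)
```

In the text's terms, the two reconstructed curves are what get plotted and compared for rise or fall (Fig. 5.12).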

Synonyms for the term “alternative impact plan” are: “series comparison plan”, “synchronized impact plan”, “multiple schedule plan”, etc.

Another option is the reverse plan. It is used to study two alternative forms of behavior. Initially, a baseline level of manifestation of both forms of behavior is recorded. The first behavior can be actualized with the help of a specific influence, and the second, incompatible with it, is simultaneously provoked by another type of influence. The effect of two interventions is assessed. After a certain time, the combination of influences is reversed so that the first form of behavior receives the influence that initiated the second form of behavior, and the second - the influence relevant to the first form of behavior. This design is used, for example, in studying the behavior of young children (Fig. 5.13).

In the psychology of learning, the method of changing criteria, or the "changing-criterion design," is used. Its essence is that a change in the subject's behavior is recorded in response to each increment of the influence. The growth of the registered behavioral parameter is recorded, and the next increment is delivered only after the subject reaches the specified criterion level. After performance has stabilized, the subject is presented with the next gradation of the influence. The curve of a successful experiment (one confirming the hypothesis) resembles a staircase, where the beginning of each step coincides with the onset of a new level of influence and its end with the subject reaching the next criterion.
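The advancement rule of a changing-criterion design — move to the next level of influence only after the current criterion is reached — can be sketched as a small loop (function name and data are invented):

```python
def advance_criteria(performance, criteria):
    """Changing-criterion logic: record which criterion levels were reached,
    advancing to the next criterion only after the current one is met."""
    reached = []
    level = 0
    for p in performance:
        if level < len(criteria) and p >= criteria[level]:
            reached.append(level)
            level += 1
    return reached

# performance climbs stepwise through criteria 5, 8, 11 -> all three steps reached
assert advance_criteria([3, 5, 6, 8, 9, 11], [5, 8, 11]) == [0, 1, 2]
```

Each appended level corresponds to one "step" of the staircase described above.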

A way to level out the "sequence effect" is to invert the sequence of exposures: the A-B-B-A plan. Sequence effects are associated with the influence of a previous exposure on a subsequent one (other names: order effects, or transfer effects). Transfer can be positive or negative, symmetrical or asymmetrical. The sequence A-B-B-A is called a positionally equalized scheme. As Gottsdanker notes, the effects of variables A and B are due to early or late transfer: exposure A is associated with late transfer, while B is associated with early transfer. In addition, if a cumulative effect is present, two consecutive exposures to B may affect the subject as a single cumulative exposure. An experiment can be successful only if these effects are insignificant. The variants of plans discussed above, with regular alternation or with random sequences, are usually very long and therefore difficult to implement.

To summarize briefly, we can say that the schemes for presenting the influence are used depending on the capabilities that the experimenter has.

A random sequence of influences is obtained by randomizing tasks. It is used in experiments requiring a large number of samples. Random alternation of influences guarantees against the manifestation of sequence effects.

For a small number of samples, a regular alternation scheme of type A-B-A-B is recommended. Attention should be paid to the periodicity of background influences, which may coincide with the action of the independent variable. For example, if you give one intelligence test in the morning, and the second one always in the evening, then under the influence of fatigue the effectiveness of the second test will decrease.

A positionally equalized sequence can be suitable only when the number of influences (tasks) is small and the influence of early and late transfer is insignificant.

But none of the schemes excludes the manifestation of differential asymmetric transfer, when the influence of previous exposure A on the effect of exposure B is greater than the influence of previous exposure B on the effect of exposure A (or vice versa).

A variety of designs for one subject were summarized by D. Barlow and M. Hersen in the monograph “Experimental designs for single cases” (Single case experimental designs, 1984) (Table 5.13).

Table 5.13

Major artifacts in a single-subject study are virtually unavoidable. It is difficult to imagine how the effects associated with the irreversibility of events can be eliminated. If the effects of order or interaction of variables are to some extent controllable, then the already mentioned effect of asymmetry (differential transfer) cannot be eliminated.

No less problems arise when establishing the initial level of intensity of the recorded behavior (the level of the dependent variable). The initial level of aggressiveness that we recorded in a child in a laboratory experiment may be atypical for him, since it was caused by recent previous events, for example, a quarrel in the family, suppression of his activity by peers or teachers in kindergarten.

The main problem is the possibility of transferring the results of the study of one subject to each of the representatives of the population. We are talking about taking into account individual differences that are significant for the study. Theoretically, the following move is possible: presentation of individual data in a “dimensionless” form; in this case, individual parameter values ​​are normalized to a value equal to the spread of values ​​in the population.

Let's look at an example. In the early 1960s. in the laboratory of B. N. Teplov, a problem arose: why are all the graphs describing changes in reaction time depending on the intensity of the stimulus different for the subjects? V. D. Nebylitsyn [Nebylitsyn V. D., 1966] proposed presenting to the subjects a signal that does not change in units of physical intensity, but in units of a previously measured individual absolute threshold (“one threshold”, “two thresholds”, etc.). The results of the experiment brilliantly confirmed Nebylitsyn’s hypothesis: the curves of the dependence of reaction time on the level of influence, measured in units of the individual absolute threshold, turned out to be identical for all subjects.

A similar scheme is used when interpreting data. At the Institute of Psychology of the Russian Academy of Sciences, A. V. Drynkov conducted research into the process of formation of simple artificial concepts. The learning curves showed the dependence of success on time. They turned out to be different for all subjects: they were described by power functions. Drynkov suggested that normalizing individual indicators to the value of the initial level of training (along the Y axis) and to the individual time to achieve the criterion (along the X axis) makes it possible to obtain a functional dependence of success on time, the same for all subjects. This was confirmed: the indicators of changes in the individual results of the subjects, presented in a “dimensionless” form, obeyed the quadratic power law.
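The "dimensionless" normalization used by Nebylitsyn and Drynkov — rescaling each subject's curve by that subject's own reference values — can be sketched as follows (names and data are invented; real studies would normalize by the measured threshold or criterion time):

```python
def normalize(times, successes):
    """Rescale one subject's learning curve into 'dimensionless' form:
    time as a fraction of that subject's time-to-criterion,
    success relative to that subject's initial level."""
    t_max = times[-1]
    y0 = successes[0]
    return [t / t_max for t in times], [y / y0 for y in successes]

# two subjects with different raw curves...
t1, y1 = normalize([1, 2, 4], [10, 20, 40])
t2, y2 = normalize([2, 4, 8], [5, 10, 20])
# ...collapse onto the same normalized curve
assert t1 == t2 and y1 == y2
```

When individual differences act as simple scale factors, this transformation exposes the common functional dependence the text describes.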

Consequently, identifying a general pattern by leveling individual differences is decided each time on the basis of a meaningful hypothesis about the influence of an additional variable on the interindividual variation in the results of the experiment.

Let us dwell once again on one feature of experiments with the participation of one subject. The results of these experiments depend heavily on the experimenter's preconceptions and on the relationship that develops between him and the subject. When conducting a long series of sequential exposures, the experimenter can unconsciously or consciously act in such a way that the subject actualizes behavior confirming the experimental hypothesis. That is why in this kind of research it is recommended to use "blind" and "double-blind" experiments. In the blind variant, the experimenter knows, but the subject does not, when the latter receives the placebo and when the real exposure. In a "double-blind" experiment, the study is conducted by a researcher who is unfamiliar with the hypothesis and does not know when the subject receives the placebo and when the treatment.

Experiments involving one subject play an important role in psychophysiology, psychophysics, learning psychology, and cognitive psychology. The methodology of such experiments has penetrated into the psychology of programmed training and social management, into clinical psychology, especially into behavioral therapy, the main promoter of which is Eysenck [Eysenck G. Yu., 1999].

Careful planning ensures reliability and accuracy in the research and provides for nuances that are difficult to track during everyday "spontaneous experimentation." Often, in order to adjust the plan, experimenters conduct a so-called pilot, or trial, study, which can be considered a "draft" of the future scientific experiment.


Basic Questions Answered by an Experimental Design

An experimental design is created to answer basic questions about how the experiment is organized.

One of the most important questions an experimental design must answer is in what sequence the changes of the stimuli under study (the independent variables) that affect the dependent variable should occur. Such exposure can vary from a simple scheme A1-A2, where A1 is the first value of the stimulus and A2 the second, to more complex ones such as A1-A2-A1-A2, and so on. The sequence of stimulus presentation is a very important issue directly related to maintaining the validity of the study: for example, if a person is constantly presented with the same stimulus, he may become less susceptible to it.

Planning stages

Planning includes two stages:

  1. Content planning of the experiment:
    • Determination of a number of theoretical and experimental provisions that form the theoretical basis of the study.
    • Formulation of theoretical and experimental research hypotheses.
    • Selecting the required experimental method.
    • Solution to the issue of sampling subjects:
      • Determining the composition of the sample.
      • Determining the sample size.
      • Determining the sampling method.
  2. Formal experimental planning:
    • Achieving the ability to compare results.
    • Achieving the possibility of discussing the data obtained.
    • Ensuring that research is carried out cost-effectively.

The main goal of formal planning is to eliminate the maximum possible number of reasons for distorting the results.

Types of plans

Simple plans

Simple, or single-factor, designs involve studying the influence of only one independent variable on the dependent variable. The advantage of such designs is their effectiveness in establishing the influence of the independent variable, as well as the ease of analysis and interpretation of the results. The disadvantage is the inability to draw conclusions about the functional relationship between the independent and dependent variables.

Experiments with reproducible conditions

Designs for Multilevel Experiments

When experiments use one independent variable, the situation where only two of its values are studied is the exception rather than the rule. Most single-factor studies use three or more values of the independent variable; such designs are often called single-factor multilevel designs. They can be used both to study nonlinear effects (which require the independent variable to take more than two values) and to test alternative hypotheses. The advantage of such designs is the ability to determine the type of functional relationship between the independent and dependent variables. The disadvantage is that they take more time and require more participants.

Factorial designs

Factorial designs involve the use of more than one independent variable. There can be any number of such variables, or factors, but they are usually limited to two or three, less often four.

Factorial designs are described using a numbering system showing the number of independent variables and the number of values (levels) each variable takes. For example, a 2x3 ("two by three") factorial design has two independent variables (factors), the first of which takes two values ("2") and the second three values ("3"); a 3x4x5 factorial design has three independent variables, taking "3", "4", and "5" values respectively.

In an experiment conducted using a 2x2 factorial design, suppose one factor, A, can take two values, A1 and A2, and another factor, B, can take the values B1 and B2. According to the 2x2 plan, four experimental conditions should be run:

  1. A1B1
  2. A1B2
  3. A2B1
  4. A2B2

The order of experiments may be different depending on the expediency determined by the tasks and conditions of each specific experiment.
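The enumeration of conditions generalizes to any factorial design as a Cartesian product of the factor levels. A minimal sketch (the function name and label format are invented):

```python
from itertools import product

def conditions(factors):
    """All cells of a factorial design,
    e.g. {'A': 2, 'B': 2} -> A1B1, A1B2, A2B1, A2B2."""
    level_lists = [[f"{name}{i}" for i in range(1, n + 1)]
                   for name, n in factors.items()]
    return ["".join(cell) for cell in product(*level_lists)]

assert conditions({"A": 2, "B": 2}) == ["A1B1", "A1B2", "A2B1", "A2B2"]
assert len(conditions({"A": 3, "B": 4, "C": 5})) == 60  # the 3x4x5 design
```

The length of the returned list is exactly the product of the level counts in the design's numbering (2x2 = 4, 3x4x5 = 60).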

Quasi-experimental designs

Quasi-experimental designs are designs for experiments in which, due to incomplete control of variables, it is impossible to draw conclusions about the existence of a cause-and-effect relationship. The concept of a quasi-experimental design was introduced by Campbell and Stanley in Experimental and Quasi-Experimental Designs for Research (Campbell, D. T. & Stanley, J. C.). This was done to overcome some of the problems faced by psychologists who wished to conduct research in settings less restrictive than the laboratory. Quasi-experimental designs are often used in applied psychology.

Types of quasi-experimental designs:

1. Experimental designs for non-equivalent groups

2. Plans of discrete time series.

Types:

1. Time Series Design Experiment

2. Plan of series of time samples

3. Plan of series of equivalent impacts

4. Design with non-equivalent control group

5. Balanced plans.

Ex post facto plans

Studies in which data are collected and analyzed after the event has already occurred are called ex post facto research; many experts classify them as quasi-experimental. Such research is often carried out in sociology, pedagogy, clinical psychology, and neuropsychology. The essence of an ex post facto study is that the experimenter himself does not influence the subjects: the influence is some real event in their lives.

In neuropsychology, for example, research was for a long time (and even today) based on the localization paradigm, which is expressed in the "locus - function" approach and states that lesions of certain structures make it possible to identify the localization of mental functions: the specific material substrate in which they "reside" in the brain [see A. R. Luria, "Brain lesions and cerebral localization of higher functions"]. Such studies can be classified as ex post facto studies.

When planning an ex post facto study, a rigorous experimental design is simulated, with equalization or randomization of groups and testing after the exposure.

Small N Experimental Designs

Small-N designs are also called "single-subject designs" because the behavior of each subject is considered individually. One of the main reasons for using small-N experiments is that in some cases results obtained by generalization over large groups cannot be applied to any of the participants individually (which leads to a violation of individual validity).

A correlation study is research conducted to confirm or refute a hypothesis about a statistical relationship (correlation) between several (two or more) variables. The design of such a study differs from a quasi-experimental one in that it lacks a controlled influence on the object of study.

In a correlational study, the scientist hypothesizes the existence of a statistical relationship between several mental properties of an individual, or between certain external variables and mental states, while assumptions about causal dependence are not discussed. Subjects must be in equivalent, unaltered conditions. In general terms, the design of such a study can be described as P x O ("subjects" x "measurements").
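The statistical relationship at the heart of such a study is typically quantified by the Pearson correlation coefficient between two sets of measurements. A self-contained sketch (invented data; real studies would also test the coefficient's significance):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two measured variables (no manipulation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# perfectly linearly related measurements correlate at r = 1
assert abs(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]) - 1.0) < 1e-9
```

Note that even r = 1 warrants no causal conclusion here: there is no controlled influence, only co-measurement.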

Types of correlation studies

  • Comparison of two groups
  • One-dimensional study
  • Correlation study of pairwise equivalent groups
  • Multivariate correlation study
  • Structural correlation study
  • Longitudinal correlation study*

* Longitudinal research is considered an intermediate option between quasi-experiment and correlational research.