This paper gives an account of the results of an investigation into one-dimensional systematic sampling, i.e. the sampling of sequences of quantitative values by the use of sampling points equally spaced along the sequence. New methods, using what are termed partial systematic samples, are evolved for estimating the systematic sampling error from short sections of sequences of completely enumerated numerical material. This overcomes the difficulty, which previously existed, that the only estimates of the systematic sampling error of a numerical sequence, even when completely enumerated, were those provided by the actual deviations of the systematic samples of the whole sequence. Such deviations are few in number and by no means independent. Simple end-corrections are proposed for eliminating the errors, due to trend, which are otherwise inherent in randomly located systematic samples. It is demonstrated that it is impossible to make any fully reliable estimate of the sampling error from the systematic sampling results themselves, though if the continuous components of variation are not too marked, the sum of sets of terms taken alternately positive and negative, with suitable end adjustments, will provide a moderately satisfactory estimate, which will always be an overestimate provided there are no periodicities. This estimate is substantially better than the customary estimate based on successive differences. In other cases supplementary sampling is required to furnish an estimate of error, and methods are described whereby estimates can be derived from supplementary samples at half-spacing, or at half and quarter spacing.
The performance of systematic sampling is investigated theoretically for certain mathematical functions, and also by the numerical analysis of certain numerical sequences. The mathematical functions investigated are (1) the two-valued function, f(x) = 0 or 1, corresponding to sampling for attributes, (2) the normal error function, which corresponds to sampling for density with material normally distributed about a point in a line, and (3) the one-term autoregressive function y_{r+1} = b y_r + a_r. In the case of the two-valued function the relative performance of systematic and random samples is shown to depend on the lengths of the intervals of the function relative to the sampling interval. If these are small all forms of sampling are of about equal accuracy, but if they are large, systematic sampling is on the average twice as accurate as random sampling with one point per block, which is again twice as accurate as random sampling with two points per block. Similar results hold for the autoregressive function when b → 1. In the case of the normal function, numerical analysis shows that systematic sampling over the whole of the curve is remarkably accurate in determining the integral of the curve. Mathematical reasons why this should be so are put forward. The sampling of part of the curve by systematic sampling is also investigated, and is used to demonstrate the value of end-corrections. The effect on the sampling errors of departures of actual density distributions from the normal form, due to random variations in the material, is evaluated.
Numerical analyses are made of five numerical sequences: (1) 288 altitudes at 0.1 mile intervals along a grid line of a 1 in. O.S. map, (2) yields of 96 rows of potatoes, (3) 192 daily maximum screen temperature readings, (4) 192 soil temperature readings (9 a.m.) at 4 in., (5) 192 similar readings at 12 in.
These analyses confirm the findings of the theoretical part of the investigation, and show that for these types of material the gain in precision with systematic sampling over stratified random sampling of the same intensity with one point per block is of the same order as the gain in precision with stratified random sampling with one point per block over stratified random sampling of the same intensity with two points per block, though the former tends to be larger in material of the more continuous type. The actual average ratios of the variances for the five sequences range from 1.26 to 2.99 in the first case, and 1.31 to 1.90 in the second. The relation between the gain in precision and the gain in efficiency is evaluated. The latter is always smaller owing to decrease in accuracy per point for a given method of sampling with decrease in intensity. Consideration of the relation between sampling costs and the losses due to errors in the sampling results shows, however, that with a more precise method of sampling greater accuracy should be demanded in the results.
The danger of using systematic sampling in material about which nothing is known, or on material which may be subject to periodicities, is stressed, as is the importance in large-scale sampling investigations of making a preliminary investigation before instituting systematic sampling and of arranging for adequate control of error in the form of error estimates, with supplementary observations if necessary, in systematic sampling or stratified random sampling with one point per block. Control of this type should of course also be employed in stratified random sampling with two or more points per block, but in this case no special provisions are necessary, since valid estimates of error are always available from the sampling results themselves.
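The three sampling schemes compared in the abstract can be illustrated on a completely enumerated sequence, where the variance of each scheme's sample mean can be computed exactly rather than simulated. The following is a minimal sketch, not the paper's own procedure. The function names, the unit-variance Gaussian disturbances in the autoregressive generator, and the choice of sampling without replacement within blocks are all assumptions made for illustration.

```python
import random


def autoregressive_sequence(n, b, seed=0):
    """Hypothetical generator: y[r+1] = b*y[r] + noise, noise ~ N(0, 1) (assumed)."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        y.append(b * y[-1] + rng.gauss(0.0, 1.0))
    return y


def _mean(values):
    return sum(values) / len(values)


def _pop_var(values):
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def systematic_variance(y, k):
    """Exact variance of the sample mean over the k possible systematic
    samples with spacing k (len(y) must be a multiple of k)."""
    if len(y) % k != 0:
        raise ValueError("sequence length must be a multiple of k")
    mu = _mean(y)
    return _mean([(_mean(y[s::k]) - mu) ** 2 for s in range(k)])


def stratified_variance(y, block, per_block):
    """Exact variance of the mean of block means when per_block points are
    drawn at random, without replacement (assumed), from each block."""
    if len(y) % block != 0:
        raise ValueError("sequence length must be a multiple of block")
    n_blocks = len(y) // block
    total = 0.0
    for i in range(n_blocks):
        blk = y[i * block:(i + 1) * block]
        s2 = _pop_var(blk) * block / (block - 1)  # variance with N-1 divisor
        total += s2 / per_block * (1 - per_block / block)
    return total / n_blocks ** 2


if __name__ == "__main__":
    y = autoregressive_sequence(192, b=0.9, seed=1)
    k = 8  # sampling interval; all three schemes use one point per 8 terms
    v_sys = systematic_variance(y, k)
    v_one = stratified_variance(y, k, 1)      # one point per block of k
    v_two = stratified_variance(y, 2 * k, 2)  # two points per block of 2k
    print("stratified(1) / systematic   :", v_one / v_sys)
    print("stratified(2) / stratified(1):", v_two / v_one)
```

The ratios printed for a single generated sequence vary with the seed and with b, so they should not be read as reproducing the figures quoted in the abstract. On a pure linear trend the uncorrected systematic sample does worse than one point per block, which is the situation the end-corrections are meant to address.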
Systematic sampling
Published 2019 in Philosophical transactions of the Royal Society of London. Series A: Mathematical and physical sciences
PUBLICATION RECORD
- Publication date
2019-09-26
- Source metadata
Semantic Scholar
CONCEPTS
- alternating-sum estimator
An error estimate formed by summing terms alternately positive and negative with end adjustments.
Aliases: sum of sets of terms taken alternately positive and negative
- end-correction
An adjustment applied at the ends of a systematic sample to eliminate the errors, due to trend, that are otherwise inherent in randomly located systematic samples.
Aliases: end adjustments, simple end-corrections
- partial systematic sample
A systematic sample taken from a short fully enumerated section of a numerical sequence for error estimation.
Aliases: partial systematic samples
- periodicity
A repeating pattern in the sequence that can affect the reliability of systematic sampling error estimates.
Aliases: periodicities
- successive differences estimator
A conventional error estimate based on differences between adjacent sampled values.
Aliases: customary estimate based on successive differences
- supplementary sampling
Additional sampling taken at half-spacing or at half and quarter spacing to support error estimation.
Aliases: supplementary observations, supplementary samples
- systematic sampling error
The discrepancy between systematic-sample results and the values from the fully enumerated sequence.
- trend
A gradual directional change along the sequence that can distort systematic-sample estimates.
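Three of the concepts listed above (end-correction, successive differences estimator, alternating-sum estimator) can be sketched in code. The abstract does not give formulas, so everything below is an assumption: the end-correction uses one textbook form that reweights the first and last sample values so that a linear trend is estimated exactly; the alternating-sum estimator assumes sets of nine terms with half-weights on the two end terms, consecutive sets sharing an end term; and the finite-population factor `fpc` is left to the caller. All function and parameter names are hypothetical.

```python
def end_corrected_mean(sample, start, k):
    """Mean of a systematic sample with first/last terms reweighted.
    start is the 1-based position (1..k) of the first sampled term."""
    n = len(sample)
    w = n * (2 * start - k - 1) / (2 * (n - 1) * k)
    weights = [1.0] * n
    weights[0] += w   # first term gets weight 1 + w
    weights[-1] -= w  # last term gets weight 1 - w
    return sum(wt * v for wt, v in zip(weights, sample)) / n


def successive_difference_variance(sample, fpc=1.0):
    """Variance estimate for the sample mean from squared differences
    between adjacent sampled values."""
    n = len(sample)
    s2 = sum((a - b) ** 2 for a, b in zip(sample, sample[1:])) / (2 * (n - 1))
    return fpc * s2 / n


def alternating_sum_variance(sample, set_len=9, fpc=1.0):
    """Variance estimate for the sample mean from sums of terms taken
    alternately positive and negative, with half-weights at the ends."""
    if set_len % 2 == 0 or set_len < 3:
        raise ValueError("set_len must be odd and at least 3")
    n = len(sample)
    step = set_len - 1  # consecutive sets share their end term (assumed)
    n_sets = (n - 1) // step
    if n_sets == 0:
        raise ValueError("sample too short for one set")
    coeffs = [0.5] + [(-1.0) ** j for j in range(1, set_len - 1)] + [0.5]
    norm = sum(c * c for c in coeffs)  # 7.5 when set_len == 9
    total = 0.0
    for g in range(n_sets):
        terms = sample[g * step:g * step + set_len]
        d = sum(c * v for c, v in zip(coeffs, terms))
        total += d * d
    return fpc * total / (n_sets * norm) / n


if __name__ == "__main__":
    population = list(range(1, 25))  # a pure linear trend
    k, start = 4, 1
    sample = population[start - 1::k]
    print(end_corrected_mean(sample, start, k), sum(population) / len(population))
```

Because the coefficients in each set sum to zero and are symmetric about the middle term, a purely linear trend contributes nothing to the alternating-sum estimate, whereas it inflates the successive-differences estimate. This is consistent with, but does not by itself establish, the abstract's statement that the alternating-sum estimate is the better of the two.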