ICES Journal of Marine Science: Journal du Conseil Advance Access originally published online on April 25, 2007
ICES Journal of Marine Science: Journal du Conseil 2007 64(5):939-944; doi:10.1093/icesjms/fsm047
Estimation of shrimp (Pandalus borealis) carapace length by image analysis
Institute of Marine Research, PO Box 6404, N-9294 Tromsø, Norway
tel: +47 77 609731; fax: +47 77 609701; e-mail: alf.harbitz{at}imr.no
Harbitz, A. 2007. Estimation of shrimp (Pandalus borealis) carapace length by image analysis. ICES Journal of Marine Science, 64: 939–944.
An image analysis technique was examined to assess its ability to estimate automatically the carapace length of shrimp (Pandalus borealis). Carapace length, pixel area, and weight were measured in a sample of 285 shrimp. An experienced operator measured the carapace length (13–30 mm) with an accurate slide calliper, with a precision (standard deviation) of 0.2 mm. A high-resolution still image camera was used to produce an 1810 x 1710 pixel colour image containing all 285 shrimp. The individual shrimp were segmented from the background by intensity thresholding. A linear model on a log-log scale of length in relation to pixel area yielded a precision of 0.43 mm. Despite differences in precision, the length frequency distributions based on manual and imaging techniques were similar. The central processing unit time spent by the image analysis program was <0.01 s per shrimp. This indicates the potential for precise, efficient, automatic processing of large numbers of shrimp lengths by, for example, video records of shrimp on a moving transport band.
Keywords: image analysis, length frequency distribution, length–weight relationship, shrimp carapace length, slide calliper measurements
Received 27 January 2006; accepted 5 March 2007; advance access publication 25 April 2007.
| Introduction |
|---|
Length measurements of shrimp (Pandalus borealis) are essential in conducting allometric comparisons of conditions based on length and weight measurements and growth studies that follow year classes. Age classes are typically identified from modal analysis of carapace length frequency distributions, because of a lack of biological age markers (Bergström, 2000). The conventional equipment for this purpose is an accurate electronic slide calliper handled manually. This is a tedious process that typically limits the sample size to a few hundred shrimp per trawl haul. The limited sample size makes identification of certain age classes difficult (MacDonald and Pitcher, 1979), in particular those with a small relative abundance and older age classes that are not as clearly separated in length as younger age classes. Improvement in the separation of modal age classes could be obtained if a large sample of length estimates was available (e.g. based on automatic measurements from video records of shrimp on a moving transport band; Hasselblad, 1966).
The need to obtain a large sample size of precise length measurements was the motivation to investigate the ability to derive length estimates of shrimp from images based on a very simple dimensional idea. Because an image of a shrimp represents a two-dimensional projection of a three-dimensional object, the pixel area should be approximately proportional to the square of the one-dimensional carapace length, if the shape does not vary too much with carapace length. From an image analysis perspective, the challenge is to differentiate shrimp in an image from the background, as well as from each other. Because shrimp are red, they can be differentiated using a background of another colour, then applying intensity thresholding (Russ, 1992).
Two goals of the paper are to demonstrate that pixel areas of shrimp can be measured automatically and efficiently, and that they can provide precise estimates of carapace length robust with regard to the composition of sex stages. Because only one sample is examined, which contained three different sex stages, the between-sample robustness and the calibration robustness with regard to other sex stages (e.g. shrimp with external eggs) are not treated in this paper. It is emphasized that the benefit of the imaging technique is seen in a future large sample context based on, for example, video records from a moving transport band, and not as an efficient still image alternative to calliper measurements of modest sample sizes typical for use on present scientific surveys.
| Material and methods |
|---|
During a Svalbard survey in 2005, a random sample of 285 shrimp from a trawl catch was measured for carapace length to the nearest 0.01 mm using a Mitutoyo IP65 electronic calliper. Shrimp lengths in the range 13.1–29.4 mm were measured. The slide calliper measurements were made by an experienced operator, for whom a good estimate of precision in terms of standard deviation (0.2 mm) was available after several years of precision experiments. Each shrimp was weighed to the nearest 0.1 g.
A transparent glass plate with light from below, along with a white plate, was used as background for the imaging, and five shrimp at a time were placed on the glass plate before being imaged. Each shrimp was placed in its natural position. A digital high-resolution still image camera (Nikon Coolpix 995) located 50 cm above the shrimp targets was used for imaging. Images of several transparent scales, which were located at different positions in the image frame, were taken to check that the resolution (pixels per mm) was independent of direction and pixel position. Three people carried out the three measurement processes, which together lasted less than an hour.
Image analysis
The images were imported as jpg files to a personal computer with a 1.9 GHz Intel Pentium Model 9 processor and 2 GB RAM, and processed using Matlab (The Mathworks, Inc., Natick, MA) for image analysis. For convenience and efficiency, but also to mimic a video situation with coarser resolution, the images were first down-sampled by a factor of eight in each direction. All images were then merged in a single 1810 x 1710 pixel colour image in a systematic mosaic pattern for easy identification. The pixel area for the individual shrimp in the down-sampled image varied from 367 to 1963 pixels.
The shrimp were segmented from the background based on intensity thresholding of the blue component of the image colour, because that component had the greatest contrast between the dark shrimp and the brighter background. The intensity threshold that gave the best fit between the calliper-measured carapace lengths and the image-estimated lengths was chosen. When the object pixels were identified, the next challenge was to segment and identify each shrimp individually. For this purpose, the Matlab toolbox for image analysis was applied. Objects with an area <156 pixels were eliminated.
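The thresholding, labelling, and small-object removal described above can be sketched as follows. This is a minimal pure-Python stand-in, not the Matlab toolbox routine used in the study; the function name `segment_objects`, the 4-neighbour connectivity, and the nested-list image layout are assumptions made for illustration.

```python
from collections import deque

def segment_objects(blue, threshold, min_area=156):
    """Label dark objects in a 2-D grid of blue-channel intensities.

    A pixel is an object pixel when its intensity is below `threshold`
    (the shrimp are darker than the backlit background). Connected object
    pixels (4-neighbourhood, an assumption) form one object, and objects
    with fewer than `min_area` pixels are discarded.
    Returns a list of dicts holding the pixel area and centroid per object.
    """
    rows, cols = len(blue), len(blue[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or blue[r0][c0] >= threshold:
                continue
            # Breadth-first flood fill from an unvisited object pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= rr < rows and 0 <= cc < cols
                            and not seen[rr][cc] and blue[rr][cc] < threshold):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            area = len(pixels)
            if area >= min_area:
                objects.append({
                    "area": area,
                    "centroid": (sum(p[0] for p in pixels) / area,
                                 sum(p[1] for p in pixels) / area),
                })
    return objects
```

The centroid is returned alongside the area because the study links each segmented shrimp to its calliper measurement by centroid position.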
Slide calliper precision of operator
One operator may estimate his or her precision, in terms of standard deviation, by measuring the same sample of shrimp twice. This is done by simply calculating the square root of half the variance of the differences between the two replicates. For three operators, it is sufficient that they each measure the same shrimp once, in order to estimate the precision for each of them.
Let li be the true carapace length of the ith shrimp, and let Lij be the calliper-measured length determined by operator j. The model is simply Lij = li + Eijl, with Eijl ~ N(µj, σj), where all error terms Eijl are assumed to be independent and normally distributed, with an operator-dependent mean µj and variance σj². The standard deviations, σj, are the precision measures we want to estimate.
Let Dijk denote the difference between two measurements of shrimp i by operators j and k:
[Equation (1)]
The properties of the estimators in Equation (1) were examined by performing 10 000 bootstrap estimates of the differences Dijk (Efron and Tibshirani, 1993).
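Under the model above, independent errors imply that the difference between operators j and k has variance σj² + σk². For one operator measuring twice this gives the "square root of half the variance" rule; for three operators the three pairwise variances can be solved for the three individual variances. The following Python sketch illustrates that reading. It is an assumption consistent with the stated model rather than a reproduction of the paper's estimators, and the function names are hypothetical.

```python
import math
import statistics

def precision_from_replicates(first, second):
    """Standard deviation of one operator from two replicate series."""
    diffs = [a - b for a, b in zip(first, second)]
    return math.sqrt(statistics.variance(diffs) / 2.0)

def precision_three_operators(m1, m2, m3):
    """Per-operator standard deviations from one series per operator.

    With independent errors, Var(D_jk) = sigma_j^2 + sigma_k^2, so
    sigma_1^2 = (V12 + V13 - V23) / 2, and similarly for operators 2 and 3.
    """
    def var_diff(x, y):
        return statistics.variance([a - b for a, b in zip(x, y)])

    v12, v13, v23 = var_diff(m1, m2), var_diff(m1, m3), var_diff(m2, m3)
    variances = (
        (v12 + v13 - v23) / 2.0,
        (v12 + v23 - v13) / 2.0,
        (v13 + v23 - v12) / 2.0,
    )
    # A negative estimate can occur by chance in small samples; clip at zero.
    return tuple(math.sqrt(max(v, 0.0)) for v in variances)
```

Because only the variance of the differences is used, a constant offset between operators (the means µj) does not affect the precision estimates.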
Relationships between carapace length, shrimp area, and weight
For isometric growth of a shrimp, the area and weight are expected to increase approximately as the square and cube of the length. More generally, we assume a power law between the three variables, synonymous with a linear relationship between the log-transformed variables.
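Written out, the isometric expectation and its log-linear form are as follows. The proportionality constants c_A and c_W, and the generic constant c, are symbols introduced here for illustration only.

```latex
A = c_A L^{2}, \qquad W = c_W L^{3}
\quad\Longrightarrow\quad
\log A = \log c_A + 2\log L, \qquad \log W = \log c_W + 3\log L .

% General power law: a straight line on the log-log scale,
% with intercept a = \log c and slope b.
Y = c\,L^{b} \quad\Longleftrightarrow\quad \log Y = \log c + b \log L
```

Isometric growth therefore corresponds to a slope of 2 for area and 3 for weight in the log-log regression.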
Let L, A, and W denote measured length, area, and weight, respectively, for a random shrimp. Further, let X = log(L) and Y = log(A) or log(W). The measured values deviate from the linear model y = a + bx by error terms in x and y:
[Equations (2) and (3)]
The properties of the b estimator, such as the standard deviation, are easily estimated by bootstrapping the n triple observations. For each bootstrap replication, bGM,boot, bGM,bootmin, and bGM,bootmax are calculated, and this is repeated nrep times. A conservative 95% confidence interval (CI) for b is then (bGM,bootmin,2.5%, bGM,bootmax,97.5%), where the notations 2.5% and 97.5% denote the 2.5 and 97.5 empirical percentiles, respectively. If the interval does not contain the isometric value for b, this is synonymous with rejection at the 5% level of the null hypothesis of b being equal to the isometric value vs. the two-sided alternative hypothesis of being different. A conservative interval in this context means that the level is expected to be >95%. This follows because, for the lower interval limit, the upper threshold on the unknown bias is chosen, which minimizes the lower interval limit, and vice versa for the upper interval limit.
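The bootstrap procedure can be illustrated with a short Python sketch. It assumes the geometric mean (GM) slope in the sense of Ricker (1973), i.e. the ratio of the standard deviations of y and x with the sign of their covariance, and it resamples (x, y) pairs. The bias-adjusted bounds bGMmin and bGMmax are not included, so the interval below is an ordinary percentile interval for bGM rather than the conservative interval described above. All function names are hypothetical.

```python
import math
import random
import statistics

def gm_slope(x, y):
    """Geometric mean (reduced major axis) slope: sign(cov) * sd(y) / sd(x)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = statistics.stdev(y) / statistics.stdev(x)
    return math.copysign(slope, sxy)

def percentile(sorted_values, q):
    """Empirical percentile (q in 0..100) by linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def bootstrap_slope_ci(x, y, nrep=2000, seed=1):
    """Percentile bootstrap 95% interval for the GM slope, resampling pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(nrep):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(gm_slope([x[i] for i in idx], [y[i] for i in idx]))
    slopes.sort()
    return percentile(slopes, 2.5), percentile(slopes, 97.5)
```

To test isometry, one would check whether the interval returned for log(A) against log(L) contains 2, or 3 for log(W) against log(L).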
Estimation of length
The focus is now how well one can estimate length from the pixel area. In this context, it is also interesting to examine how well length is estimated from weight, and to compare length estimates based on area and weight.
In all cases, the linear model y = a + bx is applied, with the following simple estimator for a:
[Equations (4)–(6)]
The calliper measurement variance, σop², is subtracted, because in a real situation with lengths estimated from the pixel area, this measurement error is absent. The precision measure σLA can be interpreted as the standard deviation of length measurements by pixel area as a stand-alone technique, including the inherent variation in shrimp length and area, as well as measurement errors in the area.
The exercise above can be done analogously based on weight data instead of area data, providing an estimate σLW for the precision of estimated length from weight. We can now compare the precision of length estimates based on area and weight by comparing σLA with σLW. We can also examine the correlation between the residuals LAi − Li and LWi − Li. A strong positive correlation will indicate that the inherent area and weight error variables covary much more strongly with each other than with the inherent error variable for length.
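A minimal Python sketch of the estimation chain, under stated assumptions: the slope is taken as the GM slope, the intercept as a = mean(y) − b·mean(x), length is predicted by inverting the fitted line, and the stand-alone precision is the residual variance minus σop². The exact forms of the paper's estimators may differ, and the function names are hypothetical.

```python
import math
import statistics

def fit_loglog(lengths, areas):
    """Fit log(area) = a + b*log(length) with a GM slope (positive slope assumed)."""
    x = [math.log(v) for v in lengths]
    y = [math.log(v) for v in areas]
    b = statistics.stdev(y) / statistics.stdev(x)
    a = statistics.mean(y) - b * statistics.mean(x)  # assumed intercept estimator
    return a, b

def length_from_area(area, a, b):
    """Invert the fitted line to predict length from a pixel area."""
    return math.exp((math.log(area) - a) / b)

def stand_alone_precision(lengths, areas, sigma_op=0.2):
    """Residual s.d. of predicted minus calliper length, with sigma_op^2 removed."""
    a, b = fit_loglog(lengths, areas)
    residuals = [length_from_area(A, a, b) - L for L, A in zip(lengths, areas)]
    var = statistics.variance(residuals) - sigma_op ** 2
    # Clip at zero in case the residual variance is below the calliper variance.
    return math.sqrt(max(var, 0.0))
```

The same functions apply to weight by passing weights in place of areas, which gives the analogue of σLW.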
The empirical length frequency distributions based on the calliper measured, image-based and weight-based lengths were visualized and compared by kernel smoothing (Wand and Jones, 1995), using a normal kernel with a standard deviation (window width) h = 0.5 mm. A length step of 0.1 mm is used in the visualization. For the image-based distribution, a frequency interval is calculated at each length step, to illustrate the maximal effect of a possible bias in the estimation of the regression slope parameter b. The interval limits are determined by the frequency values obtained with bGMmin and bGMmax, and the set of intervals is shown as a continuous band.
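The kernel smoothing step can be sketched in a few lines of Python, using the normal kernel with h = 0.5 mm and the 0.1 mm length step mentioned above. The function name, the default grid limits (three kernel widths beyond the data), and the normalization to unit area are assumptions for illustration.

```python
import math

def kernel_density(lengths, h=0.5, step=0.1, lo=None, hi=None):
    """Normal-kernel density estimate evaluated on a regular length grid.

    `h` is the kernel standard deviation in mm and `step` the grid spacing.
    Returns (grid, density) as two lists of equal length.
    """
    lo = min(lengths) - 3 * h if lo is None else lo
    hi = max(lengths) + 3 * h if hi is None else hi
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    norm = 1.0 / (len(lengths) * h * math.sqrt(2.0 * math.pi))
    density = [
        norm * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in lengths)
        for g in grid
    ]
    return grid, density
```

Calling the function once with calliper lengths and once with image-based or weight-based lengths, on a common grid set through `lo` and `hi`, gives curves that can be compared directly.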
| Results |
|---|
The sample consisted of 177 males (sex stage 2), 53 females with head roe and abdominal spines (sex stage 4), and 55 females with head roe but no abdominal spines (sex stage 8; see Rasmussen, 1953, for more details on the sex stages). The shrimp weighed between 1.4 and 14.2 g, with a mean of 5.36 g.
An example of the blue component in the image of five shrimp at sex stage 4 (head roe and abdominal spines, Figure 1a), and the result of the automatic segmentation procedure based on intensity threshold, are shown in Figure 1. The segmented shrimp are shown in grey (Figure 1b) on a black background, along with small white segments that were excluded from the analysis. For comparison, the carapace lengths measured with the slide calliper and the corresponding lengths estimated from the grey pixel areas are given.
The central processing unit time spent on the image analysis was ca. 2 s, i.e. <0.01 s per shrimp, and 1000 times faster than the slide calliper measurement. This included the identification of object pixels, the segmentation of object pixels in a set of labelled and distinct objects, the calculation of object pixel areas, the separation of shrimp objects, and the identification of the shrimp based on centroid values. The latter was needed in order to link each shrimp's area with the corresponding slide calliper measurement.
Comparisons of the three operators' measurements of carapace length with a slide calliper are shown in Figure 2 based on a 30-shrimp test sample. Operator 1 (σop,1 = 0.19 mm) is the one who measured the shrimp used in the image analysis study. Note that operators 2 (σop,2 = 0.24 mm) and 3 (σop,3 = 0.14 mm) had minimal experience, illustrating that, in general, the slide calliper provides precise results. In addition, no systematic differences between the individual operators were apparent, indicating that the bias of the calliper measurements is negligible.
The straight one-to-one line gives a rather good fit to the image-estimated lengths plotted against the lengths measured by the slide calliper (correlation coefficient r = 0.990; Figure 3a). The image-estimated length precision is σLA = 0.433 mm, assuming σop = 0.2 mm. The estimated GM slope parameter in the regression of log(A) on log(L) was bGM = 2.171 (Table 1), somewhat above the isometric value of 2. Based on the 10 000 bootstrap simulations, the conservative 95% CI for b was (2.106, 2.234).
When the slope parameter was estimated for each sex stage separately, the results were bGM = 2.245, 2.368, and 2.115 for sex stages 2, 4, and 8, respectively. Interestingly, 95% CIs for b contained the isometric value 2 for sex stage 8, but not for sex stages 2 and 4. The lengths for each sex stage were also estimated based on the regression line fitted to that sex stage and compared with the lengths based on the entire sample (bGM = 2.171). The differences were negligible and largest for sex stage 4 (mean difference of 0.073 mm, s.d. of 0.105 mm). This indicated that the image-estimated lengths were reasonably robust with regard to the composition of sex stages, at least for the shrimp in the sample analysed.
To examine the assumption of normality for the distributions of the GM-estimators bGM, bGMmin, and bGMmax for b in the regression of log(A) on log(L), 10 000 bootstrap simulations were run, and the skewness and kurtosis values were calculated. The skewness results were 0.092, 0.038, and 0.046, and the kurtosis results were 3.017, 2.933, and 2.918 for the three estimators, respectively. These values are close to the theoretical values 0 and 3 for a normal distribution. We conclude that possible deviations from normality are so small that they have no influence on the conclusions based on the normality assumption.
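For reference, moment-based skewness and kurtosis of a set of bootstrap estimates can be computed as in the Python sketch below. The text does not state which variant of the estimators was used (for example, with or without small-sample corrections), so the plain moment ratios here are an assumption, and the function name is hypothetical.

```python
def skewness_kurtosis(values):
    """Moment-based sample skewness and (non-excess) kurtosis.

    For a normal distribution the theoretical values are 0 and 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```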
When pixel area was converted to physical area, the following relationship between carapace length and projected shrimp area, A, was found, with estimated standard deviations given after the ± symbol:
[Equation (7)]
The lengths estimated from weight were even more highly correlated with the calliper-measured lengths than were the image-based lengths, with an empirical correlation coefficient of rLW = 0.993 (Figure 3b). The weight-based length precision is σLW = 0.338 mm. In this case, the slope parameter bGM in the regression of log(W) on log(L) was 2.999, i.e. very close to the isometric value 3. The conservative 95% CI for bGM was (2.928, 3.069), so in a two-sided hypothesis test with a 5% test level, there was no evidence in the data to reject the hypothesis of an isometric relationship between length and weight. For the analysis on each sex stage separately, bGM was not significantly different from 3 for sex stages 2 and 4, but was significantly less than 3 for sex stage 8 on a 5% test level.
The correspondence between the image- and weight-based length estimates is shown in Figure 3c, and good correspondence is clear (r = 0.994). The correlation coefficient between the residuals LA − L and LW − L was 0.66. When the image model underestimates or overestimates the length compared with the slide calliper, the weight model tends to move in the same direction. This also illustrates an interesting potential for estimating weight from pixel area.
Kernel smoothed length frequency distributions based on the three different length sources (slide-calliper measurements, pixel area and weight) are shown in Figure 4, based on a normal kernel with a standard deviation (window width) h = 0.5 mm. The image-based length distributions are shown as a band where, for each fine 0.1 mm step in length, the limits of the band are determined by the bias factor threshold ±(1 − r) for the slope parameter bGM in the regression of log(A) on log(L). It is therefore clear that the effect of this unknown bias is negligible. All curves show reasonably good agreement with each other.
| Discussion |
|---|
The faster than isometric growth of pixel area as a function of carapace length (b > 2) was in contrast to the isometric length–weight relationship (b = 3). A plausible biological explanation for this difference can be found if the growth of the shell and the rest of the shrimp is considered as two different processes. From experience (Einar Nilssen, Institute of Marine Research, Tromsø, pers. comm.), it is well known that the relative area of the dominant pleonite, which for females covers the external eggs, tends to be visibly larger in females than in males. Owing to the expanded thin shell, such an effect is expected to be relatively greater on area than on weight.
An important question is how the pixel area would vary with carapace length for shrimp with external eggs that were not present in the sample analysed. This sex stage will be given special priority in a future development of the technique. In fact, it might be expected that the relationship between length and area is less influenced by the presence of external eggs than the length–weight relationship, because in contrast to the weight, the projected area of the pleonites covering the eggs is expected to be far less influenced by the presence or absence of the eggs. Even if the slope parameter in the linear regression between length and area, on a log-log scale, appears to depend on sex stage in general, the difference may be so modest that the image-estimated lengths will be robust, as indicated from this analysis.
Another potential application of the imaging technique that should be investigated is to identify sex stages, e.g. by colour segmentation, because head roe and eggs have colours that are clearly different from the red shrimp. In this case, the length estimates may improve as well, if it turns out that there are fewer between-sample differences in the length–area correspondence for a given sex stage than there are between-sex stage differences. In any case, it is unrealistic to expect that the imaging technique will replace conventional manual techniques completely. For example, it is hard to see how the trans-sexual stage can be separated from males by efficient image analysis.
A future challenge will be to improve the precision of the imaging technique. One approach is to use a background colour, e.g. blue, that is different from the red colour of the shrimp, in order to enhance the separation between shrimp and the background. Another approach is to use background correction, e.g. by first taking a reference video (or photos) without shrimp, then subtracting it from the images with shrimp. One may also think of more advanced image analysis, e.g. to apply elliptical Fourier coefficients (Kuhl and Giardina, 1982) to derive some appropriate smooth shrimp contours to reduce noise.
An alternative to detecting the entire shrimp pixel area where in principle no shrimp separation might be necessary, is to focus on the eyes of the shrimp, although in this case high-resolution images would be required. Such an enhancement could be obtained with normal video resolution, though, by utilizing the fact that the same eyes are visible on several images in the video and therefore can be enhanced. Shrimp eyes should in fact be easy to segment owing to their darkness and almost circular shape, though other problems must be investigated, such as the possibly different number of eyes visible (0, 1, or 2) for each shrimp.
Although length has been the key variable considered, the imaging technique clearly has a great potential to estimate weight, not least indicated by the rather strong positive correlation between the residuals of length estimates based on weight and the residuals of length estimates based on pixel area. The imaging technique therefore has the potential to obtain reliable biomass estimates, e.g. for separate age classes identified by the analysis of the length frequency distribution provided by the imaging technique.
| Acknowledgements |
|---|
I am grateful for the help I received from several colleagues. Thomas De Lange Wenneck had the responsibility for the imaging set-up and the imaging process, and Ole Thomas Albert was responsible for the weighing process. Michaela Aschan provided relevant biological literature and references. All three, as well as Knut Sunnanå, contributed to a better paper through valuable discussions, and Ole Thomas Albert additionally through proof reading and the idea of using imaged shrimp eyes. Finally, I thank the reviewers and the editor, who contributed significantly to improving the paper after the first draft.
| References |
|---|
Bergström B. I. The biology of Pandalus. Advances in Marine Biology (2000) 38:55–245.
Efron B., Tibshirani R. J. An Introduction to the Bootstrap. (1993) London: Chapman and Hall.
Hasselblad V. Estimation of parameters for a mixture of normal distributions. Technometrics (1966) 8:431–444.
Kuhl F. P., Giardina C. Elliptic Fourier features of a closed contour. Computer Graphics and Image Processing (1982) 18:236–258.
MacDonald P. D. M., Pitcher T. J. Age-groups from size-frequency data: a versatile and efficient method of analysing distribution mixtures. Journal of the Fisheries Research Board of Canada (1979) 36:987–1001.
Rasmussen B. On the geographical variation in growth and sexual development of the deep sea prawn (Pandalus borealis). Report of Norwegian Fishery and Marine Investigations (1953) 10:1–160.
Ricker W. E. Linear regressions in fishery research. Journal of the Fisheries Research Board of Canada (1973) 30:409–434.
Russ J. C. The Image Processing Handbook. (1992) Boca Raton, FL: CRC Press.
Wand M. P., Jones M. C. Kernel Smoothing. (1995) London: Chapman and Hall.