

ICES Journal of Marine Science: Journal du Conseil Advance Access originally published online on April 25, 2007
ICES Journal of Marine Science: Journal du Conseil 2007 64(5):939-944; doi:10.1093/icesjms/fsm047

© 2007 International Council for the Exploration of the Sea. Published by Oxford Journals. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Estimation of shrimp (Pandalus borealis) carapace length by image analysis

Alf Harbitz

Institute of Marine Research, PO Box 6404, N-9294 Tromsø, Norway

tel: +47 77 609731; fax: +47 77 609701; e-mail: alf.harbitz{at}imr.no

Harbitz, A. 2007. Estimation of shrimp (Pandalus borealis) carapace length by image analysis. — ICES Journal of Marine Science, 64: 939–944.

An image analysis technique was examined to assess its ability to estimate automatically the carapace length of shrimp (Pandalus borealis). Carapace length, pixel area, and weight were measured in a sample of 285 shrimp. An experienced operator measured the carapace length (13–30 mm) with an accurate slide calliper, with a precision (standard deviation) of ~0.2 mm. A high-resolution still image camera was used to produce an 1810 × 1710 pixel colour image containing all 285 shrimp. The individual shrimp were segmented from the background by intensity thresholding. A linear model on a log-log scale of length in relation to pixel area yielded a precision of 0.43 mm. Despite differences in precision, the length frequency distributions based on manual and imaging techniques were similar. The central processing unit time spent by the image analysis program was <0.01 s per shrimp. This indicates the potential for precise, efficient, automatic processing of large numbers of shrimp lengths by, for example, video records of shrimp on a moving transport band.

Keywords: image analysis, length frequency distribution, length–weight relationship, shrimp carapace length, slide calliper measurements

Received 27 January 2006; accepted 5 March 2007; advance access publication 25 April 2007.


    Introduction
Length measurements of shrimp (Pandalus borealis) are essential in conducting allometric comparisons of conditions based on length and weight measurements and growth studies that follow year classes. Age classes are typically identified from modal analysis of carapace length frequency distributions, because of a lack of biological age markers (Bergström, 2000). The conventional equipment for this purpose is an accurate electronic slide calliper handled manually. This is a tedious process that typically limits the sample size to a few hundred shrimp per trawl haul. The limited sample size makes identification of certain age classes difficult (MacDonald and Pitcher, 1979), in particular those with a small relative abundance and older age classes that are not as clearly separated in length as younger age classes. Improvement in the separation of modal age classes could be obtained if a large sample of length estimates was available (e.g. based on automatic measurements from video records of shrimp on a moving transport band; Hasselblad, 1966).

The need to obtain a large sample size of precise length measurements was the motivation to investigate the ability to derive length estimates of shrimp from images based on a very simple dimensional idea. Because an image of a shrimp represents a two-dimensional projection of a three-dimensional object, the pixel area should be approximately proportional to the square of the one-dimensional carapace length, if the shape does not vary too much with carapace length. From an image analysis perspective, the challenge is to differentiate shrimp in an image from the background, as well as from each other. Because shrimp are red, they can be differentiated using a background of another colour, then applying intensity thresholding (Russ, 1992).

Two goals of the paper are to demonstrate that pixel areas of shrimp can be measured automatically and efficiently, and that they can provide precise estimates of carapace length robust with regard to the composition of sex stages. Because only one sample is examined, which contained three different sex stages, the between-sample robustness and the calibration robustness with regard to other sex stages (e.g. shrimp with external eggs) are not treated in this paper. It is emphasized that the benefit of the imaging technique is seen in a future large sample context based on, for example, video records from a moving transport band, and not as an efficient still image alternative to calliper measurements of modest sample sizes typical for use on present scientific surveys.


    Material and methods
During a Svalbard survey in 2005, a random sample of 285 shrimp from a trawl catch was measured for carapace length to the nearest 0.01 mm using a Mitutoyo IP65 electronic calliper. Shrimp lengths in the range 13.1–29.4 mm were measured. The slide calliper measurements were made by an experienced operator, for whom a good estimate of precision in terms of standard deviation (0.2 mm) was available from several years of precision experiments. Each shrimp was weighed to the nearest 0.1 g.

A transparent glass plate with light from below, along with a white plate, was used as background for the imaging, and five shrimp at a time were placed on the glass plate before being imaged. Each shrimp was located in its natural position. A digital high-resolution still image camera (Nikon Coolpix 995) located 50 cm above the shrimp targets was used for imaging. Images of several transparent scales, which were located at different positions in the image frame, were taken to check that the resolution (pixels per mm) was independent of direction and pixel position. Three people carried out the three measurement processes (calliper length, weight, and imaging), which lasted less than an hour in total.

Image analysis
The images were imported as jpg files to a personal computer with a 1.9 GHz Intel Pentium Model 9 processor and 2 GB RAM, and processed using Matlab (The Mathworks, Inc., Natick, MA) for image analysis. For convenience and efficiency, but also to mimic a video situation with coarser resolution, the images were first down-sampled by a factor of eight in each direction. All images were then merged in a single 1810 × 1710 pixel colour image in a systematic mosaic pattern for easy identification. The pixel area for the individual shrimp in the down-sampled image varied from 367 to 1963 pixels.

The shrimp were segmented from the background based on intensity thresholding of the blue component of the image colour, because that component had the greatest contrast between the dark shrimp and the brighter background. The intensity threshold that gave the best fit between the calliper-measured carapace lengths and the image-estimated lengths was chosen. When the object pixels were identified, the next challenge was to segment and identify each shrimp individually. For this purpose, the Matlab toolbox for image analysis was applied. Objects with an area <156 pixels were eliminated.
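The segmentation steps described above (threshold the blue component, group object pixels into distinct objects, discard objects smaller than 156 pixels) can be sketched as follows. This is a minimal illustration in plain Python, not the Matlab toolbox routines used in the study. The function name `segment_dark_objects`, the rule that a pixel belongs to a shrimp when its blue intensity is below the threshold, and the use of 4-connectivity are assumptions made for the sketch.

```python
from collections import deque


def segment_dark_objects(blue, threshold, min_area=156):
    """Label connected dark regions in a blue-channel image.

    blue      : 2-D list (rows of numbers) holding blue-channel intensities
    threshold : pixels with intensity < threshold count as object pixels
                (assumed rule: dark shrimp on a brighter background)
    min_area  : objects with fewer pixels than this are discarded

    Returns a list of (area, centroid_row, centroid_col) tuples.
    """
    n_rows, n_cols = len(blue), len(blue[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    objects = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or blue[r0][c0] >= threshold:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area, row_sum, col_sum = 0, 0, 0
            while queue:
                r, c = queue.popleft()
                area += 1
                row_sum += r
                col_sum += c
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if inside and not seen[rr][cc] and blue[rr][cc] < threshold:
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            # Area thresholding: keep only objects large enough to be shrimp.
            if area >= min_area:
                objects.append((area, row_sum / area, col_sum / area))
    return objects
```

The centroids are returned alongside the areas because the study used centroid values to link each segmented shrimp to its calliper measurement.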

Slide calliper precision of operator
One operator may estimate his or her precision, in terms of standard deviation, by measuring the same sample of shrimp twice. This is done by simply calculating the square root of half the variance of the differences between the two replicates. For three operators, it is sufficient that they each measure the same shrimp once, in order to estimate the precision for each of them.
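The single-operator rule follows because, for two independent replicates with the same error variance, the variance of their difference is twice that variance. A minimal sketch, assuming the two replicate series are given as equally long lists in the same shrimp order (the function name is hypothetical):

```python
import math
import statistics


def precision_from_replicates(first, second):
    """Standard deviation of one measurement, from two replicate series.

    Var(first - second) = 2 * sigma^2 when both replicates have independent
    errors with the same variance, so sigma = sqrt(Var(differences) / 2).
    """
    diffs = [a - b for a, b in zip(first, second)]
    return math.sqrt(statistics.variance(diffs) / 2.0)
```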

Let $l_i$ be the true carapace length of the ith shrimp, and let $L_{ij}$ be the calliper-measured length determined by operator j. The model is simply $L_{ij} = l_i + E_{ij}$, $E_{ij} \sim N(\mu_j, \sigma_j)$, where all error terms $E_{ij}$ are assumed to be independent and normally distributed, with an operator-dependent mean $\mu_j$ and variance $\sigma_j^2$. The standard deviations, $\sigma_j$, are the precision measures we want to estimate.

Let Dijk denote the difference between two measurements of shrimp i by operators j and k:


$$D_{ijk} = L_{ij} - L_{ik}$$

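The variance of $D_{ijk}$ for operators j and k is estimated as:


$$\hat{V}_{jk} = \frac{1}{n-1}\sum_{i=1}^{n}\left(D_{ijk} - \bar{D}_{jk}\right)^2, \qquad \bar{D}_{jk} = \frac{1}{n}\sum_{i=1}^{n} D_{ijk},$$

where n is the number of shrimp measured, from which we can estimate the precision of each of the three operators as


$$\hat{\sigma}_1^2 = \tfrac{1}{2}\left(\hat{V}_{12} + \hat{V}_{13} - \hat{V}_{23}\right), \quad \hat{\sigma}_2^2 = \tfrac{1}{2}\left(\hat{V}_{12} + \hat{V}_{23} - \hat{V}_{13}\right), \quad \hat{\sigma}_3^2 = \tfrac{1}{2}\left(\hat{V}_{13} + \hat{V}_{23} - \hat{V}_{12}\right) \qquad (1)$$
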
To obtain meaningful estimates, these expressions must be positive. This might not be the case when the levels of precision are very different, or when the sample size is small. In the case of one operator with a much larger variance than the other two, the problem lies in the estimation of precision for the most precise operators.
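The three-operator estimate rests on the variance of a difference between two operators being the sum of their two error variances, which gives three equations in three unknowns. A minimal sketch under that assumption follows; the function name is hypothetical, the sample variance uses the n - 1 divisor, and a non-positive variance estimate is reported as `None` rather than as a precision.

```python
import math
import statistics


def three_operator_precision(m1, m2, m3):
    """Estimate each operator's standard deviation from one series apiece.

    m1, m2, m3 : measurements of the same shrimp, in the same order,
                 by operators 1, 2 and 3.
    Returns a list of three values; an entry is None when the variance
    estimate for that operator is not positive.
    """
    def var_diff(a, b):
        # Sample variance of the pairwise differences; the true lengths
        # cancel, and operator-specific means are removed by centring.
        return statistics.variance([x - y for x, y in zip(a, b)])

    v12, v13, v23 = var_diff(m1, m2), var_diff(m1, m3), var_diff(m2, m3)
    variances = [
        (v12 + v13 - v23) / 2.0,  # operator 1
        (v12 + v23 - v13) / 2.0,  # operator 2
        (v13 + v23 - v12) / 2.0,  # operator 3
    ]
    return [math.sqrt(v) if v > 0 else None for v in variances]
```

Returning `None` makes the failure case explicit: with very unequal precision or a small sample, the estimate for the most precise operators can fall at or below zero.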

The properties of the estimators in Equation (1) were examined by performing 10 000 bootstrap estimates of the differences Dijk (Efron and Tibshirani, 1993).

Relationships between carapace length, shrimp area, and weight
For isometric growth of a shrimp, the area and weight are expected to increase approximately as the square and cube of the length. More generally, we assume a power law between the three variables, synonymous with a linear relationship between the log-transformed variables.
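Written out, the power-law assumption and its log-linear form are as follows. The symbols $c_A$, $c_W$, $b_A$, and $b_W$ are introduced here only for illustration; in the linear model used below they correspond to $a = \log c$ and slope $b$.

```latex
A = c_A\,L^{b_A} \iff \log A = \log c_A + b_A \log L,
\qquad
W = c_W\,L^{b_W} \iff \log W = \log c_W + b_W \log L
```

Isometric growth corresponds to $b_A = 2$ and $b_W = 3$.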

Let L, A, and W denote measured length, area, and weight, respectively, for a random shrimp. Further, let X = log(L) and Y = log(A) or log(W). The measured values deviate from the linear model y = a + bx by terms $\Delta x$ and $\Delta y$:


$$X = x + \Delta x, \qquad Y = y + \Delta y = a + bx + \Delta y$$

Note that the error terms include measurement errors as well as inherent deviations from the linear model. As an estimator for b, I applied the geometric mean estimate of the functional regression of Y on X, introduced by Ricker (1973):


$$b_{GM} = \sqrt{\frac{S_Y^2}{S_X^2}} = \frac{S_Y}{S_X} \qquad (2)$$

where $S_X^2$ and $S_Y^2$ are the sample variances of X and Y, respectively. This estimator has superior bias properties compared with the classical least-squares estimator for b when both X and Y are stochastic. In fact, for modest error terms, the following expressions can be deduced as modifications of the bGM estimator, which adjust for the maximum and minimum bias that can occur:


[Equation (3): definitions of bGMmin and bGMmax in terms of bGM and r]

where r is the empirical correlation coefficient between X and Y. In other words, if bGM was adjusted by the true, unknown, and unestimable bias, the adjusted bGM estimator would lie between bGMmin and bGMmax. The expressions in Equation (3) will be very useful in making inferences of isometric vs. non-isometric growth, as well as assessments of how sensitive the length frequency distribution is with regard to this unknown bias.
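As background for why bounds of this kind exist, the following is a sketch of the standard errors-in-variables argument at the population level. It is not necessarily the exact form of Equation (3). It assumes that x, $\Delta x$, and $\Delta y$ are mutually independent and that b > 0, and it writes $\rho$ for the population correlation between X and Y, with $b_{GM} = \sqrt{\operatorname{Var}(Y)/\operatorname{Var}(X)}$.

```latex
\begin{aligned}
\operatorname{Var}(X) &= \sigma_x^2 + \sigma_{\Delta x}^2, \qquad
\operatorname{Var}(Y) = b^2\sigma_x^2 + \sigma_{\Delta y}^2, \qquad
\operatorname{Cov}(X,Y) = b\,\sigma_x^2, \\
\rho\, b_{GM} &= \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)}
  = b\,\frac{\sigma_x^2}{\sigma_x^2 + \sigma_{\Delta x}^2} \;\le\; b, \\
\frac{b_{GM}}{\rho} &= \frac{\operatorname{Var}(Y)}{\operatorname{Cov}(X,Y)}
  = b + \frac{\sigma_{\Delta y}^2}{b\,\sigma_x^2} \;\ge\; b, \\
\frac{1}{\rho} &= 1 + (1-\rho) + O\!\left((1-\rho)^2\right).
\end{aligned}
```

Under these assumptions the true slope lies between $\rho\,b_{GM}$ and $b_{GM}/\rho$. The lower bound is attained when all error is in Y and the upper bound when all error is in X. For correlations close to 1, both bounds differ from $b_{GM}$ by a relative amount of about $1-\rho$.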

The properties of the b estimator, such as the standard deviation, are easily estimated by bootstrapping the n observation triples (L, A, W). For each bootstrap replication, $b_{GM,\mathrm{boot}}$, $b_{GM,\mathrm{boot}}^{\min}$, and $b_{GM,\mathrm{boot}}^{\max}$ are calculated, and this is repeated $n_{\mathrm{rep}}$ times. A conservative 95% confidence interval (CI) for b is then $(b_{GM,\mathrm{boot}}^{\min,2.5\%},\ b_{GM,\mathrm{boot}}^{\max,97.5\%})$, where the notations 2.5% and 97.5% denote the 2.5 and 97.5 empirical percentiles, respectively. If the interval does not contain the isometric value for b, this is synonymous with rejection at the 5% level of the null hypothesis of b being equal to the isometric value vs. the two-sided alternative hypothesis of being different. A conservative interval in this context means that the level is expected to be >95%. This follows because, for the lower interval limit, the upper threshold on the unknown bias is chosen, which minimizes the lower interval limit, and vice versa for the upper interval limit.
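The bootstrap procedure can be sketched as below. Because the exact bias-adjusted expressions are not reproduced here, the sketch assumes the bounds r·bGM and bGM/r from the standard errors-in-variables argument; the function names, the nearest-rank percentile rule, and the default of 2000 replications are likewise assumptions made for illustration.

```python
import math
import random
import statistics


def gm_slope(x, y):
    """Geometric mean slope: ratio of sample standard deviations."""
    return statistics.stdev(y) / statistics.stdev(x)


def pearson_r(x, y):
    """Empirical correlation coefficient between x and y."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def empirical_percentile(sorted_values, q):
    """Nearest-rank empirical percentile of an already sorted list."""
    k = math.ceil(q * len(sorted_values)) - 1
    return sorted_values[max(0, min(len(sorted_values) - 1, k))]


def conservative_ci(x, y, n_rep=2000, seed=0):
    """Conservative 95% interval for the slope b by paired bootstrap.

    Assumed bounds per replication: r * b_GM (lower) and b_GM / r (upper).
    """
    rng = random.Random(seed)
    n = len(x)
    lows, highs = [], []
    for _ in range(n_rep):
        # Resample observation pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        b = gm_slope(xb, yb)
        r = pearson_r(xb, yb)
        lows.append(b * r)
        highs.append(b / r)
    lows.sort()
    highs.sort()
    return empirical_percentile(lows, 0.025), empirical_percentile(highs, 0.975)
```

Taking the 2.5 percentile of the lower bound and the 97.5 percentile of the upper bound widens the interval relative to an ordinary percentile interval for bGM, which is what makes it conservative.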

Estimation of length
The focus is now how well one can estimate length from the pixel area. In this context, it is also interesting to examine how well length is estimated from weight, and to compare length estimates based on area and weight.

In all cases, the linear model y = a + bx is applied, with the following simple estimator for a:


$$\hat{a} = \bar{Y} - b_{GM}\,\bar{X} \qquad (4)$$

where an overbar denotes arithmetic average. As an example, let X = log(L) and Y = log(A). The length is then estimated from A by the expression


$$L_A = \exp\!\left(\frac{\log(A) - \hat{a}}{b_{GM}}\right) \qquad (5)$$

and the residuals are simply $L_{Ai} - L_i$, i = 1, ..., n. A measure of precision of length estimated by pixel area is then


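$$\hat{\sigma}_{LA} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(L_{Ai} - L_i\right)^2 - \hat{\sigma}_{\mathrm{op}}^2} \qquad (6)$$
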
Because each of the differences in the sum above is calculated for the same individual, the unknown, true length disappears. The calliper measurement variance, $\hat{\sigma}_{\mathrm{op}}^2$, is subtracted, because in a real situation with lengths estimated from the pixel area, this measurement error is absent. The precision measure $\hat{\sigma}_{LA}$ can be interpreted as the standard deviation of length measurements by pixel area as a stand-alone technique, including the inherent variation in shrimp length and area, as well as measurement errors in the area.
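The calibration and precision steps described above (estimate a, back-transform area to length, subtract the calliper variance from the mean squared residual) can be sketched as follows. Natural logarithms and a 1/n divisor for the mean squared residual are assumptions of this sketch, as are the function names.

```python
import math
import statistics


def fit_log_log(lengths, areas):
    """Fit log(area) = a + b * log(length) with the geometric mean slope."""
    x = [math.log(v) for v in lengths]
    y = [math.log(v) for v in areas]
    b = statistics.stdev(y) / statistics.stdev(x)
    a = statistics.fmean(y) - b * statistics.fmean(x)  # line through the centroid
    return a, b


def length_from_area(area, a, b):
    """Invert the fitted line to estimate length from a measured area."""
    return math.exp((math.log(area) - a) / b)


def stand_alone_precision(lengths, estimates, sigma_op=0.2):
    """Residual standard deviation with the calliper variance removed.

    Returns None when the mean squared residual does not exceed sigma_op**2.
    """
    n = len(lengths)
    msr = sum((e - l) ** 2 for e, l in zip(estimates, lengths)) / n
    excess = msr - sigma_op ** 2
    return math.sqrt(excess) if excess > 0 else None
```

The same three functions apply unchanged to weight data by passing weights in place of areas.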

The exercise above can be done analogously based on weight data instead of area data, providing an estimate $\hat{\sigma}_{LW}$ for the precision of estimated length from weight. We can now compare the precision of length estimates based on area and weight by comparing $\hat{\sigma}_{LA}$ with $\hat{\sigma}_{LW}$. We can also examine the correlation between the residuals $L_{Ai} - L_i$ and $L_{Wi} - L_i$. A strong positive correlation will indicate that the inherent area and weight error variables covary much more strongly with each other than with the inherent error variable for length.

The empirical length frequency distributions based on the calliper measured, image-based and weight-based lengths were visualized and compared by kernel smoothing (Wand and Jones, 1995), using a normal kernel with a standard deviation (window width) h = 0.5 mm. A length step of 0.1 mm is used in the visualization. For the image-based distribution, a frequency interval is calculated at each length step, to illustrate the maximal effect of a possible bias in the estimation of the regression slope parameter b. The interval limits are determined by the frequency values obtained with bGMmin and bGMmax, and the set of intervals is shown as a continuous band.
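A minimal sketch of the kernel smoothing described above, with a normal kernel of standard deviation h = 0.5 mm evaluated in steps of 0.1 mm. The function name and the choice to extend the grid three kernel widths beyond the data are assumptions of the sketch.

```python
import math


def kernel_density(lengths, h=0.5, step=0.1, lo=None, hi=None):
    """Normal-kernel density estimate of a length frequency distribution.

    lengths : measured or estimated lengths (mm)
    h       : kernel standard deviation (window width, mm)
    step    : spacing of the evaluation grid (mm)
    Returns (grid, density) as two lists of equal length.
    """
    lo = min(lengths) - 3 * h if lo is None else lo
    hi = max(lengths) + 3 * h if hi is None else hi
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    norm = 1.0 / (len(lengths) * h * math.sqrt(2.0 * math.pi))
    density = [
        norm * sum(math.exp(-0.5 * ((g - v) / h) ** 2) for v in lengths)
        for g in grid
    ]
    return grid, density
```

Evaluating the function once with lengths from the lower slope bound and once with lengths from the upper slope bound, then taking the two curves as band limits at each grid point, gives the kind of continuous band described in the text.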


    Results
The sample consisted of 177 males (sex stage 2), 53 females with head roe and abdominal spines (sex stage 4), and 55 females with head roe but no abdominal spines (sex stage 8; see Rasmussen, 1953, for more details on the sex stages). The shrimp weighed between 1.4 and 14.2 g, with a mean of 5.36 g.

Figure 1 shows an example of the blue component of an image of five shrimp at sex stage 4 (head roe and abdominal spines; Figure 1a), together with the result of the automatic segmentation procedure based on intensity thresholding (Figure 1b). The segmented shrimp are shown in grey on a black background, along with small white segments that were excluded from the analysis. For comparison, the carapace lengths measured with the slide calliper and the corresponding lengths estimated from the grey pixel areas are given.


Figure 1. (a) Example of the blue component of an image with five shrimp. (b) The automatically segmented shrimp after down-sampling of the original image by a factor of 8 in each direction. The white spots in the lower panel are objects that are removed by area thresholding.

 
The central processing unit time spent on the image analysis was ca. 2 s, i.e. <0.01 s per shrimp, and 1000 times faster than the slide calliper measurement. This included the identification of object pixels, the segmentation of object pixels in a set of labelled and distinct objects, the calculation of object pixel areas, the separation of shrimp objects, and the identification of the shrimp based on centroid values. The latter was needed in order to link each shrimp's area with the corresponding slide calliper measurement.

Comparisons of the three operators' measurements of carapace length with a slide calliper are shown in Figure 2 based on a 30-shrimp test sample. Operator 1 ($\hat{\sigma}_{\mathrm{op},1}$ = 0.19 mm) is the one who measured the shrimp used in the image analysis study. Note that operators 2 ($\hat{\sigma}_{\mathrm{op},2}$ = 0.24 mm) and 3 ($\hat{\sigma}_{\mathrm{op},3}$ = 0.14 mm) had minimal experience, illustrating that, in general, the slide calliper provides precise results. In addition, no systematic differences between the individual operators were apparent, indicating that the bias of the calliper measurements is negligible.


Figure 2. Example of an experiment to estimate operator precision of carapace length measurements. Operator 1 measured the carapace lengths used in the image analysis.

 
The straight one-to-one line gives a rather good fit to the image-estimated lengths plotted against the lengths measured by the slide calliper (correlation coefficient r = 0.990; Figure 3a). The image-estimated length precision is $\hat{\sigma}_{LA}$ = 0.433 mm, assuming $\sigma_{\mathrm{op}}$ = 0.2 mm. The estimated GM slope parameter in the regression of log(A) on log(L) was bGM = 2.171 (Table 1), somewhat above the isometric value of 2. Based on the 10 000 bootstrap simulations, the conservative 95% CI for b was (2.106, 2.234).


Figure 3. (a) Carapace length estimated from the fitted image analysis model $L_A = 0.872\,A^{0.460}$, with $L_A$ in mm and A in mm², as a function of slide calliper measurements. (b) Model fit for the weights, $L_W = 11.8\,W^{0.336}$, with $L_W$ in mm and W in g. (c) The correspondence between $L_A$ and $L_W$.

 


Table 1. Values of the Ricker estimator bGM = s.d.(Y)/s.d.(X) and the estimated standard deviation (in parentheses) of bGM, based on 10 000 bootstrap simulations of the n = 285 joint length (L), pixel area (A), and weight (W) observations.

 
When the slope parameter was estimated for each sex stage separately, the results were bGM = 2.245, 2.368, and 2.115 for sex stages 2, 4, and 8, respectively. Interestingly, 95% CIs for b contained the isometric value 2 for sex stage 8, but not for sex stages 2 and 4. The lengths for each sex stage were also estimated based on the regression line fitted to that sex stage and compared with the lengths based on the entire sample (bGM = 2.171). The differences were negligible and largest for sex stage 4 (mean difference of 0.073 mm, s.d. of 0.105 mm). This indicated that the image-estimated lengths were reasonably robust with regard to the composition of sex stages, at least for the shrimp in the sample analysed.

To examine the assumption of normality for the distributions of the GM-estimators bGM, bGMmin, and bGMmax for b in the regression of log(A) on log(L), 10 000 bootstrap simulations were run, and the skewness and kurtosis values were calculated. The skewness results were 0.092, –0.038, and –0.046, and the kurtosis results were 3.017, 2.933, and 2.918 for the three estimators, respectively. These values are close to the theoretical values 0 and 3 for a normal distribution. We conclude that possible deviations from normality are so small that they have no influence on the conclusions based on the normality assumption.
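The normality check can be sketched as below for any list of bootstrap estimates. The moment-based definitions used here (central moments with a 1/n divisor, so that a normal distribution has skewness 0 and kurtosis 3) are an assumption; the source does not state which variant was used.

```python
import statistics


def skewness_and_kurtosis(values):
    """Moment-based skewness and kurtosis (normal distribution: 0 and 3)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```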

When pixel area was converted to physical area, the following relationship between carapace length and projected shrimp area, A, was found, with estimated standard deviations given after the ± symbol:


[Equation (7): fitted relationship between L and A, with estimated standard deviations]

where L is in mm and A is in mm². The ± symbol in the first parenthesis is reversed compared with that in the exponent to indicate that the estimators for a and b are strongly negatively correlated (correlation coefficient equal to –0.999).

The lengths estimated from weight were even more highly correlated with the calliper-measured lengths than were the image-based lengths, with an empirical correlation coefficient of rLW = 0.993 (Figure 3b). The weight-based length precision is $\hat{\sigma}_{LW}$ = 0.338 mm. In this case, the slope parameter bGM in the regression of log(W) on log(L) was 2.999, i.e. very close to the isometric value 3. The conservative 95% CI for bGM was (2.928, 3.069), so in a two-sided hypothesis test with a 5% test level, there was no evidence in the data to reject the hypothesis of an isometric relationship between length and weight. For the analysis on each sex stage separately, bGM was not significantly different from 3 for sex stages 2 and 4, but was significantly less than 3 for sex stage 8 on a 5% test level.

The correspondence between the image- and weight-based length estimates is shown in Figure 3c, and good correspondence is clear (r = 0.994). The correlation coefficient between residuals $L_A - L$ and $L_W - L$ was 0.66. When the image model underestimates or overestimates the length compared with the slide calliper, the weight model tends to move in the same direction. This also illustrates an interesting potential for estimating weight from pixel area.

Kernel-smoothed length frequency distributions based on the three different length sources (slide-calliper measurements, pixel area and weight) are shown in Figure 4, based on a normal kernel with a standard deviation (window width) h = 0.5 mm. The image-based length distributions are shown as a band where, for each fine 0.1 mm step in length, the limits of the band are determined by the bias factor threshold ±(1 − r) for the slope parameter bGM in the regression of log(A) on log(L). It is therefore clear that the effect of this unknown bias is negligible. All curves show reasonably good agreement with each other.


Figure 4. Kernel-smoothed length frequency distributions with a normal kernel with s.d. of h = 0.5 mm. The solid curve is based on calliper measurements, the dashed one on lengths estimated from weight. The shaded curve is based on lengths estimated from pixel area with the bias corrected estimators bGMmin and bGMmax, illustrating the maximum effect of the unknown bias of the b estimator.

 

    Discussion
The faster than isometric growth of pixel area as a function of carapace length (b > 2) was in contrast to the isometric length–weight relationship (b = 3). A plausible biological explanation for this difference can be found if the growth of the shell and the rest of the shrimp is considered as two different processes. From experience (Einar Nilssen, Institute of Marine Research, Tromsø, pers. comm.), it is well known that the relative area of the dominant pleonite, which for females covers the external eggs, tends to be visibly larger in females than in males. Owing to the expanded thin shell, such an effect is expected to be relatively greater on area than on weight.

An important question is how the pixel area would vary with carapace length for shrimp with external eggs that were not present in the sample analysed. This sex stage will be given special priority in a future development of the technique. In fact, it might be expected that the relationship between length and area is less influenced by the presence of external eggs than the length–weight relationship, because in contrast to the weight, the projected area of the pleonites covering the eggs is expected to be far less influenced by the presence or absence of the eggs. Even if the slope parameter in the linear regression between length and area, on a log-log scale, appears to depend on sex stage in general, the difference may be so modest that the image-estimated lengths will be robust, as indicated from this analysis.

Another potential application of the imaging technique that should be investigated is to identify sex stages, e.g. by colour segmentation, because head roe and eggs have colours that are clearly different from the red shrimp. In this case, the length estimates may improve as well, if it turns out that there are fewer between-sample differences in the length–area correspondence for a given sex stage than there are between-sex-stage differences. In any case, it is unrealistic to expect that the imaging technique will replace conventional manual techniques completely. For example, it is hard to see how the trans-sexual stage can be separated from males by efficient image analysis.

A future challenge will be to improve the precision of the imaging technique. One approach is to use a background colour, e.g. blue, that is different from the red colour of the shrimp, in order to enhance the separation between shrimp and the background. Another approach is to use background correction, e.g. by first taking a reference video (or photos) without shrimp, then subtracting it from the images with shrimp. One may also think of more advanced image analysis, e.g. to apply elliptical Fourier coefficients (Kuhl and Giardina, 1982) to derive some appropriate smooth shrimp contours to reduce noise.

An alternative to detecting the entire shrimp pixel area is to focus on the eyes of the shrimp, in which case separating individual shrimp might in principle be unnecessary, although high-resolution images would be required. The required resolution could, however, be obtained from normal video by exploiting the fact that the same eyes are visible in several frames and can therefore be enhanced. Shrimp eyes should in fact be easy to segment owing to their darkness and almost circular shape, though other problems must be investigated, such as the possibly different number of eyes visible (0, 1, or 2) for each shrimp.

Although length has been the key variable considered, the imaging technique clearly has a great potential to estimate weight, not least indicated by the rather strong positive correlation between the residuals of length estimates based on weight and the residuals of length estimates based on pixel area. The imaging technique therefore has the potential to obtain reliable biomass estimates, e.g. for separate age classes identified by the analysis of the length frequency distribution provided by the imaging technique.


    Acknowledgements
 
I am grateful for the help I received from several colleagues. Thomas De Lange Wenneck had the responsibility for the imaging set-up and the imaging process, and Ole Thomas Albert was responsible for the weighing process. Michaela Aschan provided relevant biological literature and references. All three, as well as Knut Sunnanå, contributed to a better paper through valuable discussions, and Ole Thomas Albert additionally through proof reading and the idea of using imaged shrimp eyes. Finally, I thank the reviewers and the editor, who contributed significantly to improving the paper after the first draft.


    References

    Bergström B. I. The biology of Pandalus. Advances in Marine Biology (2000) 38:55–245.

    Efron B., Tibshirani R. J. An Introduction to the Bootstrap. (1993) London: Chapman and Hall.

    Hasselblad V. Estimation of parameters for a mixture of normal distributions. Technometrics (1966) 8:431–444.

    Kuhl F. P., Giardina C. Elliptic Fourier features of a closed contour. Computer Graphics and Image Processing (1982) 18:236–258.

    MacDonald P. D. M., Pitcher T. J. Age-groups from size-frequency data: a versatile and efficient method of analysing distribution mixtures. Journal of the Fisheries Research Board of Canada (1979) 36:987–1001.

    Rasmussen B. On the geographical variation in growth and sexual development of the deep sea prawn (Pandalus borealis). Report of Norwegian Fishery and Marine Investigations (1953) 10:160.

    Ricker W. E. Linear regressions in fishery research. Journal of the Fisheries Research Board of Canada (1973) 30:409–434.

    Russ J. C. The Image Processing Handbook. (1992) Boca Raton, FL: CRC Press.

    Wand M. P., Jones M. C. Kernel Smoothing. (1995) London: Chapman and Hall.

