© 2005 International Council for the Exploration of the Sea
The fuzzy relationship between trawl and acoustic surveys in the North Sea
Centre for Environment, Fisheries and Aquaculture Science Pakefield Road, Lowestoft, Suffolk NR33 0HT, England, UK
*Correspondence to S. Mackinson: tel: +44 1502 524295; fax: +44 1502 524511. e-mail: s.mackinson{at}cefas.co.uk.
Adding information on the horizontal and vertical distribution of fish both on and between trawl stations is reason enough to perform acoustic surveys routinely in tandem with annual groundfish trawl surveys. Ideally, acoustic and trawl density indices could be combined to maximize information on fish distribution and provide more reliable estimates of stock size. The core of the problem boils down to the question: "how does what we see on an echosounder relate to what we catch in a net?" The fuzzy logic "model-free estimation" approach presented here sidesteps the need to understand specific mechanisms that determine the nature and variability of any relationship between acoustics and trawl catches. Fuzzy logic models that describe and predict the relationship linking acoustics and environmental variables (inputs) with trawl catches (output) are developed, and the sensitivities and robustness of the approach are discussed. In the models examined, the static environmental variables location and depth proved to be better predictors of trawl catches in the North Sea than the acoustic energy in the first 5 m off the bottom. We suggest that finding the "hidden" relationship between acoustics and trawls will require closer attention to partitioning the acoustics data by species/assemblages and understanding the key gear and behavioural differences responsible for producing the high between-gear variability.
Keywords: acoustic survey, fuzzy logic, groundfish, North Sea, trawl survey
Received 22 November 2004; accepted 28 June 2005.
Introduction
Adding information on the horizontal and vertical distribution of fish both on and between trawl stations is reason enough to perform acoustic surveys routinely in tandem with groundfish bottom-trawl surveys. It would be better still if it were possible to combine the acoustic and trawl density indices in a way that maximized information on fish distribution and allowed more reliable estimates of stock size. This has been recognized for some time (e.g. Ona et al., 1991; Godø and Wespestad, 1993; Everson et al., 1996). In their geostatistical analysis of combined acoustic and groundfish surveys from the Norwegian Sea, North Sea, and Irish Sea, Bez et al. (2002) showed clearly that the spatial structure of fish aggregations is smaller than the average distance between trawl stations. Therefore, the variance of interpolated abundance indices might be reduced if the more highly resolved between-station acoustic information could be used in calculating the index. Although combined acoustic-trawl surveys have enabled researchers to use one sampling tool to provide information on the biological/behavioural (Engås and Soldal, 1992; Michalson et al., 1996; Godø et al., 1998; Aglen et al., 1999; McQuinn et al., 1999; Hjellvik et al., 2003) and environmental (Godø and Wespestad, 1993; Aglen, 1996; Michalson et al., 1996) factors influencing the efficiency of the other (and hence reliability of independent abundance indices), there has been limited effort (Aglen, 1996; Cachera et al., 1999) to integrate both sources of data in a single index. This objective has been the focus of a European collaborative research effort, CATEFA (Combining Acoustic and Trawl surveys to Estimate Fish Abundance), for which this contribution presents an analysis of the UK North Sea groundfish data.
Simply stated, the problem boils down to the question: "how does what we see on an echosounder relate to what we catch in a net?" If we are able to establish a relationship between the acoustic measurements and the catches at trawl stations, we can use it to predict the trawl catches between stations. By adopting such a model-based prediction scheme to "fill in the gaps" (Figure 1), the amount of interpolation needed to estimate the stock size over the whole survey area is reduced, and consequently the variance of the survey estimate should be lower. Before tackling what may seem a straightforward task, there are three rather knotty issues that need to be recognized and, where possible, taken into consideration.
First, it is not possible to be certain of the identity of the fish represented by acoustic backscattering. Second, as the strength of acoustic backscatter is mostly dependent on the size of the swimbladder (strictly speaking the area insonified), it is probably not appropriate to compare the acoustic backscatter directly with trawl catch numbers or weight. Ideally, the trawl catches need to be represented by an appropriate acoustic equivalent. Third, at any particular location and time, the relationship between acoustics and trawl catches will depend on the species composition and their size and behaviour, as well as the physical properties that influence the performance of the acoustics and trawl gear (including factors such as seabed topography, substratum, water depth, and tide).
This communication presents the methods and results of fuzzy logic models (Zadeh, 1973) that link acoustics and environmental variables with trawl data, in an attempt to provide an improved abundance index for groundfish. The analysis presented here sidesteps the need to understand the specific mechanisms that determine the nature and variability of any relationship between acoustics and trawl catches (i.e. the third point above) by taking a data-driven approach. The analysis searches for patterns in the relationship between acoustic backscatter and trawl catches. It uses explanatory variables that give the best predictions of the trawl catch without providing an explanation of the underlying mechanisms.
The "model-free estimation" approach (Kosko, 1993a) adheres to the principle of fuzzy approximation theory (Kosko, 1993b; Kasabov, 1996), which, simply stated, demonstrates that any continuous curve (function) can be approximated to any degree of accuracy by covering it with discrete patches. An individual patch equates directly to a single fuzzy rule of the form IF this THEN that. A collection of patches (or rules) combines to produce a fuzzy system that describes the relationships between input variables and output variables, where one or more of the variables is represented using fuzzy membership functions (sets). The set of rules (model) is used to make inferences about new data. For more details and an example, see Mackinson et al. (1999).
Searches for clear linear relationships between environmental data are generally rather unsatisfying; more often than not a shotgun scatter of data points is observed. Such scattered data are frequently typified by patches of data points that cannot be described by simple equations (see, for example, fish stock-recruitment relationships; Myers et al., 1995). Clustering algorithms provide a way to search for patches in data that can be used to show how similar inputs associate with similar outputs. The key concept is: Cluster = Patch = Rule. By identifying the clusters, it is possible to construct the fuzzy IF...THEN rules that describe the connection between input and output variables and to build fuzzy models that approximate the continuous functional form of the unknown relationship (Höppner et al., 1999). The ability to quantify the mixing or overlap between clusters makes fuzzy clustering a particularly attractive method to use on environmental data, where often we want to describe and quantify the spatial relationships within data (Equihua, 1990).
Methods
Data collection and preparation
Trawl catch, acoustics, and environmental (CTD) data were collected from three International Bottom Trawl Surveys (IBTS) carried out during August of 2000, 2001, and 2002 in the North Sea on board RV "Cirolana".
Acoustics data were collected using a SIMRAD EK500 scientific echosounder with a hull-mounted 38-kHz split-beam transducer, and post-processed at −70 dB using Sonardata Echoview®. Quality control procedures were employed to exclude data considered spurious or unfit for processing as a consequence of poor weather conditions, vessel turning, interference from other instruments, inclusion of bottom structures, or excessive surface aeration. Further, to ensure that the acoustic marks typical of large aggregations of pelagic fish (herring and sprat) did not swamp the signal from demersal fish, those distinctive high-intensity acoustic marks were filtered out using knowledge of their acoustic characteristics, schooling behaviour, distribution, and composition in the trawl catch. Using an elementary distance-sampling unit (EDSU) of 0.5 nautical miles, data were subdivided vertically into discrete depth layers (Figure 2) and horizontally into "on-station" sections recorded during trawling and "between-station" sections recorded when travelling between trawl stations. The processed data provide a value of nautical area scattering coefficient (NASC, m² nautical mile⁻²; MacLennan et al., 2002) for each individual layer within each EDSU. Note that for the on-station data, the NASC is calculated as the weighted average NASC of the EDSUs representing the length of the trawl tow (i.e. pro rata EDSUs covered by the trawl).
Trawls were performed using a Grande Ouverture Verticale (GOV) trawl, with operating procedures adhering to the guidelines specified by the IBTS programme (Anon., 1999). In addition to catch numbers, catch weight, mean fish weight, and length frequencies for all species, the depth and near-bottom-water temperature were recorded at each of the 75 trawl stations. The trawl catch of each species at each station was represented in the same units as the acoustics data by converting the trawl catch numbers into equivalent acoustic energy, the "Equivalent NASC" (ENASC), in several steps. First, target strength TS (dB) was calculated:
[Equation (1), the target strength expression, is missing from the source text.]
The derived target strength was used to calculate the expected acoustic spherical scattering cross-section $\sigma_{sp}$ (m²), i.e. the acoustic energy from an individual fish:

$$\sigma_{sp} = 4\pi \, 10^{TS/10} \qquad (2)$$

Multiplying the spherical scattering cross-section by the number of fish caught gave the area scattering cross-section of the trawl catch, CatchNASC (m²):

$$\mathrm{CatchNASC} = \sigma_{sp} \times \text{(number of fish caught)} \qquad (3)$$

Finally, the trawl catch area scattering was divided by the swept area A (tow length × door spread, nautical mile²) to obtain ENASC (m² nautical mile⁻²):

$$\mathrm{ENASC} = \frac{\mathrm{CatchNASC}}{A} \qquad (4)$$
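The conversion from target strength to ENASC can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it takes TS as an input because the TS expression itself is not reproduced here, it assumes the standard definition of the spherical scattering cross-section (4π times the linearized TS), and the function names and example values are hypothetical.

```python
import math


def sigma_sp(ts_db: float) -> float:
    """Spherical scattering cross-section (m^2) of one fish from its TS (dB).

    Assumes the standard definition sigma_sp = 4 * pi * 10 ** (TS / 10).
    """
    return 4.0 * math.pi * 10.0 ** (ts_db / 10.0)


def enasc(ts_db: float, n_fish: int,
          tow_length_nmi: float, door_spread_nmi: float) -> float:
    """Equivalent NASC (m^2 per square nautical mile) of a trawl catch."""
    # Area scattering cross-section of the whole catch (m^2).
    catch_nasc = sigma_sp(ts_db) * n_fish
    # Swept area: tow length times door spread (square nautical miles).
    swept_area = tow_length_nmi * door_spread_nmi
    return catch_nasc / swept_area


# Hypothetical haul: 1000 fish with TS = -40 dB, 2 nmi tow, 0.05 nmi door spread.
example_enasc = enasc(-40.0, 1000, 2.0, 0.05)
```

In practice the calculation would be repeated per species and per length class before summing, since TS generally varies with fish size.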
The models presented here focus primarily on establishing a relationship between the acoustic energy in the depth zone equal to the fishing height of the trawl (5 m) and the corresponding catches of the top three demersal species: haddock, whiting, and Norway pout. These are labelled the "Default Models". In addition to the default models, environmental variables, namely temperature, depth, latitude, and longitude, were used as input variables for modelling the trawl catches. Acoustics (NASC) and trawl data (ENASC) were log-transformed to help separate tightly clustered data and reduce the effects of extreme values during modelling.
Modelling and predictions
Applying the "model-free estimation" approach we developed fuzzy models that describe and predict the relationship linking acoustics and other environmental variables (inputs) with trawl catches (output). The method consisted of two main stages, outlined in Figure 3, and briefly described below.
Defining the fuzzy system using "on-station" data
Details of the general principles and methods are provided in Mackinson et al. (1999). The following description of the methods focuses on two new methodological innovations, representing the memberships of data points to all clusters, and combination of multiple input variables. To aid the description, the treatment of a single "data instance" (Haul 62) is followed as an example.
Fuzzy clustering (Appendix) was used to search for data patches defining the relationship between acoustics and environmental input variables with the trawl output variables. These were translated into a series of fuzzy IF...THEN rules, whose confidence values were determined by the degree to which each data point belonged to each cluster, as described below.
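To make the idea of graded cluster membership concrete, the sketch below computes the membership of a single value in each of several cluster centres. It assumes a fuzzy c-means style membership formula with fuzzifier m = 2; the clustering algorithm is specified in the Appendix, which is not reproduced here, so the formula, the function name, and the example centres are assumptions for illustration only.

```python
def fcm_memberships(x, centres, m=2.0):
    """Membership of value x in each cluster centre (memberships sum to 1).

    Uses the fuzzy c-means rule u_i = d_i**(-p) / sum_j d_j**(-p),
    with p = 2 / (m - 1) and d_i = |x - centre_i|.
    """
    dists = [abs(x - c) for c in centres]
    # A point sitting exactly on a centre belongs wholly to that centre.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical example: a value of 0.5 between centres at 0.0 and 2.0
# belongs mostly to the nearer centre.
memberships = fcm_memberships(0.5, [0.0, 2.0])
```

The closer a data point lies to a cluster centre, the larger its membership in that cluster, which is what allows one data point to contribute to more than one rule.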
Taking the example of Haul 62 in Figure 4, the confidence factor (CF) for each rule is calculated by multiplying the membership degree of each of the elements, IF part and THEN part, viz.:
- Rule 1: IF Acoustics is x_cluster 6 (0.88) THEN Trawl is y_cluster 8 (0.7) CF = 0.88 × 0.7 = 0.62
- Rule 2: IF Acoustics is x_cluster 7 (0.12) THEN Trawl is y_cluster 8 (0.7) CF = 0.12 × 0.7 = 0.08
- Rule 3: IF Acoustics is x_cluster 6 (0.88) THEN Trawl is y_cluster 9 (0.3) CF = 0.88 × 0.3 = 0.26
- Rule 4: IF Acoustics is x_cluster 7 (0.12) THEN Trawl is y_cluster 9 (0.3) CF = 0.12 × 0.3 = 0.04
By describing each data instance with four rules, this new method provides a more accurate representation of the data than the method applied in Mackinson et al. (1999), where a data instance was represented by the single rule whose confidence was highest. Next, by aggregating all rules generated from each data instance, a single set of rules is defined (Table 1). The complete set of extracted rules is the "rule set" defining the relationship between a single input and output variable.
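The rule extraction for a single data instance can be sketched as follows, using the membership values quoted for Haul 62. The function name and data layout are hypothetical; only the multiplication of IF-part and THEN-part memberships is taken from the text.

```python
from itertools import product


def rules_for_instance(x_memberships, y_memberships):
    """Return (input cluster, output cluster, confidence factor) triples.

    The confidence factor of each rule is the product of the membership
    of the data point in the input cluster and in the output cluster.
    """
    return [
        (x_cluster, y_cluster, x_mem * y_mem)
        for (x_cluster, x_mem), (y_cluster, y_mem)
        in product(x_memberships.items(), y_memberships.items())
    ]


# Membership values quoted in the text for Haul 62.
haul_62 = rules_for_instance(
    {"x_cluster 6": 0.88, "x_cluster 7": 0.12},
    {"y_cluster 8": 0.7, "y_cluster 9": 0.3},
)
for x_cluster, y_cluster, cf in haul_62:
    print(f"IF Acoustics is {x_cluster} THEN Trawl is {y_cluster}  CF = {cf:.2f}")
```

Because the memberships on each side sum to one, the four confidence factors for a data instance also sum to one.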
Rule sets that have different input variables but the same output variable can be combined in one fuzzy model. In theory, there can be any number of rule sets that use different input variables to contribute information to the same output variable. Each rule set is independent, but together they form additive contributors to the final modelled prediction of the output variable. The resultant value of a predicted output variable depends on how the conclusions made by each individual rule are combined, and in particular on the number of contributing rules and their strength of influence exerted through their associated confidence factors.
For two rule sets, the weighting assigned to each is achieved by multiplying the confidence associated with each rule by a weight, w [0 ≤ w ≤ 1, where the weight on rule set 2 (w2) is 1 minus the weight on rule set 1 (w1)], that represents our degree of belief in how important each of the input variables is in determining the output variable. Applying this method of re-weighting the rules, we can test the influence of the contribution of each input variable to the output prediction by changing w from 1 to 0, plotting the modelled predictions with the observed data, and comparing the sum of squared differences. The plot of the sum of squared differences for each value of w shows which value of w (i.e. which "model") gives the best fit.
With three or more rule sets (i.e. more than two input variables) to model, we are faced with the dilemma of how to weight the rule sets. The method of using w and 1 − w does not apply. Ideally, we should examine all possible combinations of w that can be applied to the input variables. For three input variables, there are 66 combinations (264 for four, and 1320 for five input variables). Instead of examining all the possible combinations, we propose using an alternative approach that sequentially builds on the analyses already performed in the double rule set. We call the method "separable step weighting", whereby:
- The rule sets for the first two input variables are evaluated for the weight factors w1 and w2 that produce the smallest residual sum of squares (best model fit to the data).
- The third input variable rule set is included and is weighted according to w3 = 1 − (w1 + w2)k, where w1 and w2 are fixed, and k is varied between 0 and 1. The k leading to the smallest residual sum of squares is selected and the corresponding value for the weight factor w3 is calculated.
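The two search strategies can be compared with a short sketch. Several things here are assumptions rather than details from the text: a weight grid in steps of 0.1 (which reproduces the 66 combinations quoted for three input variables), the interpretation that the fixed w1 and w2 are both scaled by k in the second step, and the toy error function standing in for the residual sum of squares of a real fuzzy model.

```python
from itertools import product


def weight_grid(n_sets, steps=10):
    """All weight vectors on a 1/steps grid whose elements sum to 1."""
    return [
        tuple(i / steps for i in combo)
        for combo in product(range(steps + 1), repeat=n_sets)
        if sum(combo) == steps
    ]


def separable_step(sse, steps=10):
    """Stepwise weight search for three rule sets.

    sse is any function mapping a weight triple to a residual sum of squares.
    """
    grid = [i / steps for i in range(steps + 1)]
    # Step 1: best split between rule sets 1 and 2, third set switched off.
    w1 = min(grid, key=lambda w: sse((w, 1.0 - w, 0.0)))
    w2 = 1.0 - w1

    def scaled(k):
        # Step 2: keep the w1:w2 ratio, scale by k, remainder goes to set 3.
        return (w1 * k, w2 * k, 1.0 - (w1 + w2) * k)

    best_k = min(grid, key=lambda k: sse(scaled(k)))
    return scaled(best_k)


def toy_sse(weights, target=(0.0, 1.0, 0.0)):
    """Hypothetical error surface with its minimum at the target weights."""
    return sum((w - t) ** 2 for w, t in zip(weights, target))


full_search_size = len(weight_grid(3))
stepwise_best = separable_step(toy_sse)
```

On this grid the stepwise search evaluates 22 weight combinations instead of 66. It is not guaranteed to find the global minimum on an arbitrary error surface, which is why comparing it against the full search is a useful check.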
Implementing fuzzy models
The fuzzy rule-based models of on-station acoustics and environmental data with trawl catch were implemented in a fuzzy system freeware program, FULSOME (MacDonell et al., 1999). During the reasoning processes (often referred to as inference), maximum implication and sum aggregation methods were employed, with a centroid de-fuzzifier applied to derive a crisp (non-fuzzy) value for the predicted on-station trawl catch (see Mackinson in the report on the CATEFA website for a full description of the methods and their influence). The alpha cut threshold and rule degree threshold were both set to zero, so that all rules fire and contribute.
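The aggregation and de-fuzzification steps can be illustrated with a simplified sketch. It is not a reimplementation of FULSOME: triangular output sets, a discretized output axis, and implication by scaling each output set by its rule strength are all assumptions made for brevity, and the implication step in particular may differ from the "maximum implication" option named above. The set names and positions are hypothetical.

```python
def triangle(y, left, peak, right):
    """Triangular membership function evaluated at y."""
    if left < y <= peak:
        return (y - left) / (peak - left)
    if peak < y < right:
        return (right - y) / (right - peak)
    return 0.0


def predict(fired_rules, output_sets, y_grid):
    """Crisp prediction from fired rules via sum aggregation and centroid.

    fired_rules: list of (output set name, rule strength) pairs.
    output_sets: dict mapping set name to (left, peak, right).
    """
    # Sum aggregation: add the scaled output sets point by point.
    aggregated = [
        sum(strength * triangle(y, *output_sets[name])
            for name, strength in fired_rules)
        for y in y_grid
    ]
    total = sum(aggregated)
    if total == 0.0:
        return None  # no rule fired anywhere on the grid
    # Centroid de-fuzzifier: membership-weighted mean of the output axis.
    return sum(y * a for y, a in zip(y_grid, aggregated)) / total


# Hypothetical output sets on a log-catch axis, fired with the Haul 62 CFs.
sets = {"y_cluster 8": (0.0, 1.0, 2.0), "y_cluster 9": (2.0, 3.0, 4.0)}
rules = [("y_cluster 8", 0.62), ("y_cluster 8", 0.08),
         ("y_cluster 9", 0.26), ("y_cluster 9", 0.04)]
grid = [i / 100 for i in range(401)]
prediction = predict(rules, sets, grid)
```

With two equally shaped, non-overlapping output sets the centroid is simply the strength-weighted mean of the two peaks, so the prediction here falls closer to the peak of y_cluster 8.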
Evaluating and selecting the best models
Alternative single input variable models were evaluated, and the best was selected according to a criterion of the minimum squared residual difference between the model-predicted and the observed on-station trawl catches. Where two or three input variables were combined in one model, the rule weighting system described above enabled us to test the influence of the contribution of each input variable to the output prediction. Residuals (from log-transformed base models) were tested for normality. The histograms of the residuals displayed a normal distribution curve; plotting predicted against the absolute values of the residuals showed random distribution and the normal probability plots also obeyed normality restrictions.
Predicting "between-station" trawl catches: the new combined survey index
Using the observed between-station acoustics and/or environmental variables as input to the fuzzy models, point estimates of the between-station trawl catches were predicted (Figure 3 right panel).
These were combined with the observed on-station trawl catches and subsequently interpolated over the entire survey area. Spatial interpolation was performed using kriging after variogram-modelling of the spatial structure of the point estimates. The volume underneath the interpolated surface represents the new combined survey index. Using the default and best on-station models, we evaluated eight alternative predictions of the combined survey index and compared them with an interpolated index derived from the observed on-station trawl catch alone.
Model sensitivity tests
Tests were undertaken to evaluate model performance, sensitivity, and robustness. Specifically, the tests addressed three key questions:
- How many clusters are needed to capture the behaviour of the data?
- What effect does log-transforming the data have on model performance?
- Is the separable step weighting method an adequate approximation of the full analysis when combining multiple rule sets? Do we learn anything more using the "full" method?
For continuity and clarity, the same example data sets are followed through the evaluations, starting with single rule examples and developing through multiple rule set tests.
Results
Fuzzy models
As a guide for readers through the results, Figure 5 provides a diagrammatic route-map of the key results presented. Input variables are used to predict catches of individual species (haddock, whiting, and Norway pout) as well as their combined catch (demersal fish).
The default single variable models, using acoustics NASC in the bottom 5 m to predict trawl catch, were outperformed by other combinations of variables in both the demersal and single-species cases (Table 2). A model using longitude was the best predictor of demersal trawl catch (Figure 6a), whereas depth was the best predictor in the case of whiting (Figure 6c). For comparison, the default single variable models using acoustics NASC as input are given in Figure 6b and d. Depth was also found to be the best predictor of Norway pout and haddock catch, and second best for total demersal.
We examined whether a combination of both acoustics and longitude or depth provided a better prediction of trawl catches by combining the two rule sets of the default model and best-fit models into one "double rule set" model. After weighting each of the individual rule sets, the impact of each input variable on the output prediction was tested by changing the weight from 1 to 0, plotting the modelled predictions with the observed data, and comparing the sum of squared differences. In both double rule set models (Figure 7), reducing weight on acoustics, which increases weight on longitude and depth, caused an increased vertical spread of the predicted data and minimized the difference between the predicted and observed trawl data (see bottom panel). Plainly stated, the inclusion of acoustics input variables did not produce improved results; longitude and depth were again better predictors of demersal and whiting trawl catches, respectively, than the acoustic variable. The shape of the residual graphs (bottom panels) clearly displays the effect of varying the weights of different input variables.
Adding a third input variable, we created triple rule set models for both demersal species and whiting. Depth was added as the additional input in the demersal model, and longitude in the whiting model. All 66 possible rule set weighting combinations were examined, but again, the best fit between observed and predicted trawl catches was achieved with zero weight on the acoustic input variable. For the demersal model, the best-fit weight combination was w = (0, 0.4, 0.6) for (log 0–5 m NASC, longitude, depth; Figure 8) and for whiting w = (0, 0.9, 0.1) for (log 0–5 m NASC, depth, longitude; Figure 9). The best triple input variable weight combination for haddock was w = (0, 1, 0) for (log 0–5 m NASC, depth, longitude), i.e. all weight on the depth variable, and for Norway pout w = (0, 0.9, 0.1) for (log 0–5 m NASC, depth, latitude).
New abundance indices from combined data
The overall distribution pattern of demersal fish shown in Figure 10a reflects the combined influence of longitude and depth variables (no weight given to acoustics) predicted by the best triple rule demersal fish model. The pattern predicted by on-station trawl catches (Figure 10b) is clearly driven by the high densities at particular stations. It is not possible to say which is the better predictor of the true distribution of demersal fish, but it is clear (if unsurprising) that the combined index (using longitude and depth as predictors) has a considerably lower standard deviation (see equivalent lower panels). In respect of the overall abundance index, the estimates obtained from models combining explanatory variables for demersal fish and whiting were three and four times lower than the trawl index based on observed catches alone (Table 3). Reasons for this are suggested later.
Sensitivity testing
Number of clusters and the robustness of model predictions
While more clusters generally produce better fitting models (Figure 11), the general predictions of the models are robust over a wide range of cluster numbers. This is evidenced by the narrow range of residual values that result from alternative models constructed using different numbers of clusters (Table 2).
In Figure 11, the optimum cluster combination according to the lowest separation validity (Xie and Beni, 1991) is 11 and 10 clusters on the x and y axes, respectively, while the lowest calculated residual sum actually occurs at 13 and 10 clusters, respectively. The lowest separation validity provides the optimum cluster number according to a user-specified error criterion, but it can result in local rather than global minima. In our analysis, we frequently explored up to the maximum number of clusters that could be achieved within the program (20 clusters). We employed pragmatism and parsimony in our decision regarding the optimum number, selecting a low value of separation validity with a high number of clusters.
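For readers unfamiliar with the separation validity criterion, the sketch below computes a Xie–Beni style index for one-dimensional data: the membership-weighted within-cluster scatter divided by the number of points times the smallest squared distance between cluster centres. Lower values indicate compact, well-separated clusters. The function name, the use of a general fuzzifier m, and the example data are assumptions for illustration; the exact form used in the clustering program may differ.

```python
def xie_beni(points, centres, memberships, m=2.0):
    """Xie-Beni style separation validity for 1-D data (lower is better).

    memberships[i][k] is the membership of points[k] in cluster i.
    Requires at least two cluster centres.
    """
    n = len(points)
    # Compactness: membership-weighted squared distances to the centres.
    compactness = sum(
        memberships[i][k] ** m * (points[k] - centres[i]) ** 2
        for i in range(len(centres))
        for k in range(n)
    )
    # Separation: smallest squared distance between any two centres.
    separation = min(
        (a - b) ** 2
        for idx, a in enumerate(centres)
        for b in centres[idx + 1:]
    )
    return compactness / (n * separation)


# Hypothetical data: two tight, well-separated groups with crisp memberships.
points = [0.0, 1.0, 9.0, 10.0]
index = xie_beni(points, [0.5, 9.5], [[1, 1, 0, 0], [0, 0, 1, 1]])
```

Scanning this index over candidate cluster numbers and choosing a low value is one way to select the number of clusters, although, as noted above, the minimum found may be local rather than global.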
What effect does transforming the data have on model performance?
Raw acoustics and trawl data are typically characterized by large numbers of low values and a few very high ones. Clustering raw data results in few clusters for describing the mass of data, but a good definition of the distinct extremes. Models of raw data are "pulled" by the extreme values, resulting in the mass of data being overestimated (Figure 12). By log-transforming the data, the ability to define clusters describing the mass of data points improves dramatically. Cluster centres are more regularly spread, with both the mass and extremes well defined. However, log-models are more conservative in their predictions, tending to track the mass of data better, but underestimating extreme values (Figure 12).
Is the separable step weighting method an adequate approximation of the full analysis when combining multiple rule sets?
In the triple rule models presented here, the full 66 possible weight combinations were tested in the search for the combination providing the best fuzzy approximation. The more efficient separable step method proposed in the methods successfully identifies the best-fit model. As such, it is a pragmatic time-saving solution to the analysis that provided us with the option of exploring a broader range of input variables.
Discussion
The main objective of the analyses presented here was to develop and apply an appropriate methodology for combining acoustics and trawl data from the North Sea international bottom-trawl survey into a single abundance index. The work was driven by the premise that the inclusion of simultaneously collected acoustics survey data, with a more-resolved sampling structure, could potentially improve the precision and accuracy of the abundance index used in the stock assessment of commercial groundfish.
Fuzzy models of the relationships between acoustics and environmental variables with the fish catches at each trawl sampling station were constructed as a means of combining the data sources and to predict fish catches between trawl stations. In all models tested, depth and longitude were better predictors of fish catches than the acoustic backscatter at 0–5 m. This included cases where multiple variables were used as predictors of trawl catches; increased weighting applied to the acoustic input variable served only to degrade the model performance. The same result was found by Neville et al. (2004), who used artificial neural networks to predict trawl catches from acoustics data. As the acoustics seemed to be such a poor predictor of North Sea trawl catches, little faith can be put in the capacity of a derived acoustic-trawl combined index to improve the accuracy of stock assessments. Why, then, were the acoustics such a poor predictor and how could any problems be resolved? Below, we discuss issues relating to the methods and, with the benefit of hindsight, consider factors that influenced the success of the "data-driven" approach guiding the analyses.
The underlying principle of the fuzzy modelling used in this analysis is to "let the data tell you", and as such the models behaved like a weighted average smoother through the clusters inherent in the data. In doing so, they successfully captured, or described, the behaviour of the data, as has been shown in other applications (Mackinson et al., 1999). The models were not sensitive to the number of clusters used to define rules; the more clusters used, the more accurately the data were described. Moreover, the models were robust in respect of using the separable step weighting method and alternative inference options used during prediction (Mackinson et al., 2004).
A key difficulty with modelling the acoustics data is that they are highly skewed, with the occurrence of many small values and a few very large ones. The large values are produced by dense fish aggregations and cannot be treated as outliers, because they often account for a large proportion of the total stock. The problem of skewness is a common feature of fisheries (acoustics) data that creates a problem for any analyses. It is not a problem unique to the fuzzy methods used here. Typically it is dealt with by performing a log-transformation. The consequence of this for the fuzzy models constructed with log-transformed data is that they describe well the mass of data, but are poor at predicting the high values, i.e. they tend to be conservative in their predictions. By contrast, fuzzy models constructed using raw data are better able to represent the high values, but consistently overestimate the mass of data, giving overestimates of trawl catch (see Figure 12). In considering how we might try to address this problem in further investigations, we have come up with three possible suggestions: (i) treat the high values differently by extracting them using a percentile approach and modelling the mass of data separately, (ii) assert cluster centres by eye, and in doing so make the clusters for large values very specific and with narrow widths, (iii) log-transform the data and define cluster centres on the spread data, then back-transform the cluster centres and subsequently calculate the membership of each raw data point to each of the back-transformed cluster centres.
The founding assumption that some type of relationship between acoustics and trawl data would emerge from thorough analysis of large numbers of data seems perhaps not to have served us well. Even when the acoustics data were partitioned into detailed vertical layers and compared with both individual species and aggregate fish assemblages, no obvious patterns emerged in the data. Godø and Wespestad (1993) conjectured that a synthesis of acoustics and trawls might not be readily achievable, because each gear samples a different fraction of the stock and has a different efficiency. Their analysis showed that between-year differences in the stock composition, abundance, and distribution were important in determining the fraction of the stock available to each gear. Our preliminary analyses that included year as an input variable in the fuzzy models did not show any relationship that could be meaningfully included in the models. Godø and Wespestad (1993) further commented that the problems connected with the synthesis of acoustics and trawl data can only be overcome if the availability to the two observation methods was known and could be corrected for. As observed by many authors referred to in the introduction, many factors contribute to the variability in the acoustics and trawl indices. Of primary research importance is deducing which factors are the most important to take into consideration.
Differences in the efficiency of acoustics and trawl sampling tools undoubtedly play a large part in accounting for the variability masking any obvious relationships between the two. An interesting feature of the results from our analysis is that depth was the best predictor of trawl catch in all the individual species (and second best for demersal) models. Although many species do indeed show strong depth preferences, depth-dependence may be an artefact of gear selection determined by the ratio between warp length and bottom depth, a critically important factor influencing the performance of trawls. Perhaps our models are simply reflecting gear effects rather than true species depth effects? Unfortunately, warp length was not included as an input variable in the analysis, so it is not possible to differentiate between the two. The alignment of the trawl relative to the position of the transducer, and the relative sizes of the acoustic beam footprint and trawl sample area, are other gear-related factors that are likely to mask the relationship between trawls and acoustics.
With regard to the acoustic sampling, the chief perennial problem is identification of acoustic targets. If it were possible to partition the acoustic backscatter easily and to allocate the proportions to individual species for comparison with the trawl catch by species, the strength of the relationship between acoustics and trawls would likely be much clearer. Unfortunately, this is a non-trivial, time-consuming task. Beare et al. (2004) used the straightforward approach of applying the proportion of the known species caught in the trawl to partition the acoustics index to species. Doing so improved correlations between trawl and acoustics data, partly because it provided positive acoustic values for a species even when none of that species was observed on the echosounder. This basic approach of partitioning the acoustic backscatter according to the proportion of each species caught in the nearest trawl haul is not wholly sufficient, because it simply asserts a proportional relationship. Any pattern that results is not emergent; the results merely reflect the asserted proportionality. A more rigorous way to partition the acoustics data is to have experts judge the acoustic traces and partition the acoustic backscatter according to a suite of information, including expert knowledge of species behaviour, distribution, and acoustic properties, together with information from the trawl catch. The results of CATEFA have prompted such a preliminary detailed analysis (Godø et al., 2004), and the output indicates that the scrutiny process improves correlation between trawl and acoustics. Accurate automated identification and classification of species from acoustic recordings has beguiled acousticians for a long time. It remains the holy grail of acoustics, and despite some recent developments (Hammond and Swartzman, 2001; Reid, 2002; Petitgas et al., 2003; and the presentations and discussion that took place at the ICES Annual Science Conference, 2004, Theme session R: New developments in fisheries acoustics: applications to multi-frequency species identification), it is fair to say that, for the foreseeable future, the processing of acoustics data will still require subjective allocation, much of which depends on the skills and experience of the scientist.
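The catch-proportion partitioning discussed above can be illustrated with a minimal sketch. It is not the procedure of Beare et al. (2004); the function name, the species, and the numbers are hypothetical, and the sketch assumes the proportions are taken directly from catch quantities in the nearest haul, with no weighting for target strength.

```python
def partition_backscatter(backscatter, catch_by_species):
    """Split a total acoustic backscatter value among species in
    proportion to their share of the nearest trawl catch.

    backscatter      : total acoustic index for the sample (any linear unit)
    catch_by_species : dict of species -> catch in the nearest haul
    Both inputs are illustrative assumptions, not survey data.
    """
    total_catch = sum(catch_by_species.values())
    if total_catch <= 0:
        # An empty haul gives no basis for allocation.
        return {species: 0.0 for species in catch_by_species}
    return {
        species: backscatter * catch / total_catch
        for species, catch in catch_by_species.items()
    }


# Hypothetical haul: every species caught receives a positive acoustic
# value, whether or not it was actually seen on the echosounder.
shares = partition_backscatter(200.0, {"cod": 30.0, "haddock": 60.0, "whiting": 10.0})
```

Because each species' acoustic value is constructed from its catch share, part of any correlation between the partitioned acoustics and the catch is built in by the method, which is the limitation noted in the text.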
An almost equally vexing and persistent problem for acoustics is quantifying the unrecorded fish in the bottom dead zone (Ona and Mitson, 1996; Aglen et al., 1999). The extent of the bottom dead zone is a function of pulse length, transducer beam width, depth, and bottom configuration (MacLennan and Simmonds, 1992). When the bottom is sloping or rough, or when the vessel is pitching and rolling, the dead-zone area will increase. Differences in fish behaviour between day and night (Aglen et al., 1999; McQuinn et al., 1999), and with stock size (Godø and Wespestad, 1993), affect the proportion of the stock "hidden" in the dead zone. By extrapolating into the dead zone, McQuinn et al. (1999) applied a correction to acoustics data that averaged 8–11% of the biomass of cod (Gadus morhua) at night and 30–34% by day. Aglen et al. (1999) showed that differences in the behaviour of large and small fish further compound the problem. In the analyses presented here, considerable effort was made to minimize the dead-zone effect. Acoustics data were carefully scrutinized to prevent bottom echoes being included as fish, and the zone above the bottom that was excluded from the data (the so-called backstep) varied from 0.1 m to 0.5 m, depending on the quality of the acoustics data. When the echoes down to the echosounder-detected bottom were included in the analyses (no backstep), the variability in the acoustics data was so high that it prevented meaningful analysis. Greig and Reid (2004) studied the issue in greater detail by re-examining the backstep to determine whether the normally discarded data contained anything that would meaningfully relate to trawl catches. Their results showed only weak correlations, suggesting that any information present in the backstep is of little use in relating acoustics and trawl data.
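The dependence of the dead zone on pulse length, beam geometry, and depth can be sketched with simple flat-bottom geometry. This is a simplified illustration, not the formulation of Ona and Mitson (1996). It assumes a flat, horizontal seabed and a stable vertical beam, and it assumes that a fish echo is lost if it arrives within half a pulse length (cτ/2) in range of the first bottom return. The function name and the default sound speed are assumptions.

```python
import math


def dead_zone_height(depth_m, pulse_duration_s, off_axis_deg, sound_speed=1500.0):
    """Height above a flat seabed below which a fish is masked by the bottom echo.

    Simplified geometry (assumption): a fish at off-axis angle theta and
    height h has range (depth - h) / cos(theta); it is resolved only if that
    range is shorter than depth - c*tau/2. Solving for h gives
        h = depth * (1 - cos(theta)) + (c*tau/2) * cos(theta)
    """
    theta = math.radians(off_axis_deg)
    half_pulse = sound_speed * pulse_duration_s / 2.0
    return depth_m * (1.0 - math.cos(theta)) + half_pulse * math.cos(theta)


# On the beam axis only the pulse length matters; towards the edge of the
# beam the masked layer thickens, and more so in deeper water.
on_axis = dead_zone_height(100.0, 0.001, 0.0)
beam_edge = dead_zone_height(100.0, 0.001, 3.5)
```

Under these assumptions the masked layer is thinnest directly below the transducer and grows with depth and off-axis angle, which is consistent with the statement that sloping bottoms and vessel motion enlarge the dead zone.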
Combined physiological and behavioural differences in acoustic properties and in the ability to evade capture strongly influence the relative trawl and acoustics "detectability" of different sizes and species of fish (Michalson et al., 1996; Fréon and Misund, 1999). Vessel avoidance and herding effects in the time that passes between the echosounder recording and the fish being caught have considerable influence on both the acoustic recordings and the trawl catches (see e.g. Ona and Godø, 1990). Although vertical herding effects may increase the effective fishing height of a trawl, justification for using acoustic data from a layer equivalent to the headline height of the trawl (5 m) comes from preliminary analyses showing that most of the acoustic signal for demersal fish is found in the bottom 5 m, suggesting that any possible relationships would be found there. A recent effort to better quantify the herding effect used an upward-looking transducer mounted on the net (Michalsen et al., 1999). The net-mounted transducer recorded half as much as the hull-mounted transducer, but it is not clear whether this was due to lateral avoidance or to changes in acoustic backscatter resulting from changes in the tilt angle of the fish, which influences target strength.
Our analysis of the North Sea groundfish survey data has shown that variables other than acoustics were better predictors of the trawl catches. The data-driven approach implicitly assumed stability in the biases associated with acoustics and trawl data, an assumption that may not hold true. The comparison of data sets indicates that the availability and the errors associated with the trawl and acoustic methods of density assessment are very different, making it difficult to establish any clear relationship between concurrent trawl and acoustic samples. However, we believe that it remains reasonable to assume that a relationship between acoustics and trawls does exist, particularly as more promising results are provided by analyses of Barents Sea data conducted during CATEFA (e.g. Beare et al., 2004; Bez et al., 2004; Bouleau et al., 2004; Neville et al., 2004). In the case of the North Sea, finding the "hidden" relationship will require closer attention to partitioning the acoustic data by species/assemblage, and understanding the key gear and behavioural differences responsible for producing the high between-gear variability.
| Appendix A |
|---|
Searching for clusters and defining memberships
The fuzzy clustering routine (c-means algorithm; Dunn, 1974; Bezdek, 1981, 1987) defines the optimum patches or clusters to describe the data according to an iterative scheme that minimizes an objective function. It provides the cluster centroid position (supremum), the width (support) of the membership functions, and the degree of membership of each point to any cluster (Figure A1). By modifying the algorithm given in Bezdek (1981), it was possible to define asymmetric membership functions. This element is central to describing the data with the minimum possible number of sets. A detailed description of the algorithm is provided in Mackinson et al. (2004). Two conditions implemented in the fuzzy clustering software are worth noting here. First, for a given value of each fuzzified input variable, the degrees of belief across all membership functions sum to 1. This condition is not strictly required by Fuzzy Set Theory (Zadeh, 1965, 1973), but it is frequently applied in the development of fuzzy systems, because experience has shown that it allows users to develop logical systems quickly that work in practice (McNeill and Freiberger, 1993; Kosko, 1993a), and because the resulting memberships are more easily interpreted owing to their similarity to probabilities. Second, for the non-end sets, membership functions were specified as triangles, with the base of each triangle (the support of the fuzzy set) extending to the supremum of the adjoining set, such that the value of the input variable for which the membership to a given set is 1 (maximum) is the same as the value for which the membership to the adjacent set is 0 (minimum), and vice versa. Drawn like this, the sets have a high fuzziness, explicitly accounting for uncertainty in the input variables. Trapezoidal membership functions were used for the end sets (the innermost and outermost), preventing the model from extrapolating beyond the limits of the observed data.
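The two steps described above, locating cluster centroids and then building membership functions between them, can be sketched as follows. This is a minimal illustration under stated assumptions, not the bespoke software used in the analysis. It uses the standard (unmodified) fuzzy c-means updates on a single input variable, with a fuzzifier of m = 2 and evenly spaced starting centroids, and it assumes the centroids are distinct. All function names are hypothetical.

```python
def fuzzy_c_means_1d(values, n_clusters, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy c-means on one variable; returns sorted centroids."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly across the data range.
    centroids = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(n_iter):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centroids]
            if 0.0 in dists:
                # Point sits exactly on a centroid: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([r / total for r in row])  # each row sums to 1
        # Centroid update: mean of the data weighted by membership ** m.
        new_centroids = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            new_centroids.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights)
            )
        shift = max(abs(a - b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return sorted(centroids)


def triangular_memberships(x, centroids):
    """Degrees of membership of x in sets peaked at the sorted centroids.

    Each interior set is a triangle whose base reaches the peaks of its
    neighbours; the two end sets stay at 1 beyond their peaks (shoulders),
    so values outside the observed range are not extrapolated.
    """
    n = len(centroids)
    if x <= centroids[0]:
        return [1.0] + [0.0] * (n - 1)
    if x >= centroids[-1]:
        return [0.0] * (n - 1) + [1.0]
    degrees = [0.0] * n
    for k in range(n - 1):
        left, right = centroids[k], centroids[k + 1]
        if left <= x <= right:
            frac = (x - left) / (right - left)
            degrees[k] = 1.0 - frac
            degrees[k + 1] = frac
            break
    return degrees


# Hypothetical depth sets peaked at 10, 50, and 100 m.
print(triangular_memberships(30.0, [10.0, 50.0, 100.0]))  # [0.5, 0.5, 0.0]
```

Because adjacent triangles overlap exactly between neighbouring peaks, the memberships of any input value sum to 1, and unevenly spaced centroids give asymmetric sets without further adjustment.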
| Acknowledgements |
|---|
We gratefully acknowledge the support of the crew of RV "Cirolana" and the groundfish research team, as well as CATEFA partners for critical discussions and clarifications on the methods used in the analysis. The work was sponsored by the EU Framework 5 programme and UK Department of Environment, Food and Rural Affairs under contract MZ902.
During our analysis of the data, we developed bespoke software that performs fuzzy cluster analysis using the c-means algorithm and subsequently fuzzifies and outputs the data in a format that can be used by a fuzzy system development freeware program, FULSOME (MacDonell et al., 1999). Both the clustering software and FULSOME are freely available upon request from the first author (s.mackinson{at}cefas.co.uk), together with detailed protocols and data analysis templates facilitating the development of fuzzy models.
| Footnotes |
|---|
1 When there is a selection of input variables to choose from, the first two input variables are the pair with the smallest residual sum of squares.
2 http://www.cg.ensmp.fr/~bez/catefa/.
| References |
|---|
Aglen A. (1996) Impact of fish distribution and species composition on the relationship between acoustic and swept-area estimates of fish density. ICES Journal of Marine Science 53:501–505.
Aglen A., Engås A., Huse I., Michalsen K., Stensholt B. (1999) How vertical fish distribution may affect survey results. ICES Journal of Marine Science 56:345–360.
Anon. (1999) Manual for the International Bottom Trawl Surveys, Revision VI. ICES Document, CM 1999/D: 2, Addendum 2.
Beare D. J., Reid D. G., Greig T., Bez N., Hjellvik V., Godø O. R., Bouleau M., van der Kooij J., Neville S., Mackinson S. (2004) Positive relationships between bottom trawl and acoustic data. ICES Document, CM 2004/R: 24. 15 pp.
Bez N., Bouleau M., Godø O. R., Armstrong M. J., Gerritsen H., Vérin Y., Massé J., Méhault S. (2002) Comparison between "underway" and "on station" acoustic measurements made during bottom trawl surveys. ICES Document, CM 2002/J: 03. 17 pp.
Bez N., Reid D. R., Bouleau M., Beare D., Neville S., Vérin Y., Godø O. R., Gerritsen H. (2004) Insight on fish reaction to the presence of trawl from the comparison of acoustic data recorded during and between trawl stations. ICES Document, CM 2004/R: 14. 28 pp.
Bezdek J.C. (1981) Pattern Recognition with Fuzzy Objective Function Algorithms. Advanced Applications in Pattern Recognition (Plenum Press, New York) 256 pp.
Bezdek J.C. (1987) Some non-standard clustering algorithms. In Legendre P. and Legendre L. (Eds.). Developments in Numerical Ecology (Springer, Berlin) vol. G14: pp. 225–287 NATO ASI Series.
Bouleau M., Bez N., Reid D. G., Godø O. R., Gerritsen H. (2004) Testing various geostatistical models to combine bottom trawl catches and acoustic data. ICES Document, CM 2004/R: 28. 19 pp.
Cachera S., Massé J., Vérin Y. (1999) How the use of acoustics during bottom trawl surveys may provide more accurate abundance indices: an application to IBTS surveys carried out in the southern North Sea. ICES Document, CM 1999/J: 12. 12 pp.
CATEFA. website http://www.cg.ensmp.fr/~bez/catefa/.
Dunn J.C. (1974) A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics 3:32–57 (Reprinted in Bezdek, J. C. (ed.) Fuzzy Models for Pattern Recognition: Methods that Search for Structures in Data. Institute of Electrical and Electronics Engineers, New York, 1992. 539 pp.).
Engås A. and Soldal A.V. (1992) Diurnal variation in bottom trawl catches of cod and haddock and their influence on abundance indices. ICES Journal of Marine Science 49:89–91.
Equihua M. (1990) Fuzzy clustering of ecological data. Journal of Ecology 78:519–534.
Everson I., Bravington M., Goss C. (1996) A combined acoustic and trawl survey for efficiently estimating fish abundance. Fisheries Research 26:75–91.
Fréon P. and Misund O.A. (1999) Dynamics of Pelagic Fish Distribution and Behaviour: Effects on Fisheries and Stock Assessment (Fishing News Books, Oxford) 360 pp.
Godø O. R., Hjellvik V., Greig A., Beare D. (2004) Can subjective evaluation of echograms improve correlation between bottom trawl and acoustic densities? ICES Document, CM 2004/R: 23. 11 pp.
Godø O.R., Karp W.A., Totland A. (1998) Effects of trawl sampling variability on precision of acoustic abundance estimates of gadoids from the Barents Sea and the Gulf of Alaska. ICES Journal of Marine Science 55:86–94.
Godø O.R. and Wespestad V. (1993) Monitoring changes in abundance of gadoids with varying availability to surveys. ICES Journal of Marine Science 50:39–51.
Greig T. and Reid D. (2004) Fish in the back step: can we be missing a signal from within this layer? ICES Document, CM 2004/R: 17. 7 pp.
Hammond T.R. and Swartzman G.L. (2001) A general procedure for estimating the composition of fish school clusters using standard acoustic survey data. ICES Journal of Marine Science 58:1115–1132.
Hjellvik V., Michalsen K., Aglen A., Nakken O. (2003) An attempt at estimating the effective fishing height of the bottom trawl using acoustic survey recordings. ICES Journal of Marine Science 60:967–979.
Höppner F., Klawonn F., Kruse R., Runkler T. (1999) Fuzzy Cluster Analysis (Wiley, Chichester) 289 pp.
Kasabov N.K. (1996) Foundations of Neural Networks, Fuzzy Systems and Knowledge Engineering. A Bradford Book (MIT Press, Cambridge, MA) 550 pp.
Kosko B. (1993a) Fuzzy systems as universal approximators. IEEE transactions on computers, 1993. Proceedings of the 1992 IEEE Conference on Fuzzy Systems (FUZZ-92), San Diego, March 1992. pp. 1153–1162.
Kosko B. (1993b) Fuzzy Thinking: the New Science of Fuzzy Logic (Hyperion, New York) 318 pp.
MacDonell S., Gray G., Kilgour R., Calvert J. (1999) Fuzzy Logic for Software Metrics (FULSOME v1.01). Produced by Software Metrics Research Laboratory, Department of Information Science. The University of Otago © 1998–1999. Contact: stephen.macdonell{at}aut.ac.nz.
Mackinson S., van der Kooij J., Neville S. (2004) The fuzzy relation between acoustic and trawl surveys in the North Sea. ICES Document, CM 2004/R: 04. 35 pp.
Mackinson S., Vasconcellos M., Newlands N. (1999) A new approach to the analysis of stock-recruitment relationships: "Model-free estimation" using fuzzy logic. Canadian Journal of Fisheries and Aquatic Sciences 56:686–699.
MacLennan D.N., Fernandes P.G., Dalen J. (2002) A consistent approach to definitions and symbols in fisheries acoustics. ICES Journal of Marine Science 59:365–369.
MacLennan D.N. and Simmonds E.J. (1992) Fisheries Acoustics (Chapman and Hall, London) 324 pp.
McNeill D. and Freiberger P. (1993) Fuzzy Logic (Touchstone, New York) 320 pp.
McQuinn I. H., Simard Y., Stroud W. F., Beaulieu J. L., McCallum B., Walsh S. (1999) An adaptive integrated acoustic-trawl survey on Atlantic cod. ICES Document, CM 1999/J: 11. 22 pp.
Michalsen K., Aglen A., Somerton D., Svellingen I., Ovredal J. T. (1999) Quantifying the amount of fish unavailable to a bottom trawl by use of an upward looking transducer. ICES Document, CM 1999/J: 08. 19 pp.
Michalson K., Godø O.R., Fernö A. (1996) Diel variation in the catchability of gadoids and its influence on the reliability of abundance indices. ICES Journal of Marine Science 53:389–395.
Myers R., Bridson J., Barrowman N.J. (1995) Summary of worldwide spawner and recruitment data. Canadian Technical Report of Fisheries and Aquatic Sciences 2020: iv + 327 pp.
Neville S., Hjellvik V., Mackinson S., van der Kooij J. (2004) Using artificial neural networks to combine acoustics and trawls in the Barents and North Seas. ICES Document, CM 2004/R: 05. 19 pp.
Ona E. and Godø O.R. (1990) Fish reaction to trawling noise: the significance for trawl sampling. Rapports et Procès-Verbaux des Réunions du Conseil International pour l'Exploration de la Mer 189:159–166.
Ona E. and Mitson R.B. (1996) Acoustic sampling and signal processing near the seabed: the dead zone revisited. ICES Journal of Marine Science 53:677–690.
Ona E., Pennington M., Vølstad J. H. (1991) Using acoustics to improve the precision of bottom trawl indices of abundance. ICES Document, CM 1991/D: 13. 11 pp.
Petitgas P., Massé J., Beillois P., Lebarbier E., Le Cann A. (2003) Sampling variance of species identification in fisheries acoustic surveys based on automated procedures associating acoustic images and trawl hauls. ICES Journal of Marine Science 60:437–445.
Reid D.G. (Ed.) (2002) Report on echo trace classification. ICES Cooperative Research Report 238. 107 pp.
Xie X.L. and Beni G. (1991) A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 13:841–847 (Reprinted in Fuzzy Models for Pattern Recognition: Methods that Search for Structures in Data. Ed. by J. C. Bezdek. Institute of Electrical and Electronics Engineers, New York. 539 pp.).
Zadeh L.A. (1965) Fuzzy sets. Information and Control 8:338–353.
Zadeh L.A. (1973) Outline of a new approach to the analysis of complex systems and decision processes. IEEE Transactions on Systems, Man, and Cybernetics 3:28–44.