Using Fuzzy Logic Algorithms and Growing Hierarchical Self-Organizing Maps to Define Efficient Security Inspection Strategies in a Container Terminal
- https://doi.org/10.2991/ijcis.d.200430.001
- Container terminal, Port: Security inspection, Fuzzy logic, Growing hierarchical self-organizing map
Maritime transport is one of the oldest methods of moving various types of goods, and it continues to have an important role in our modern society. More than 20 million containers are transported across the oceans daily. However, this form of transportation is constantly threatened by illegal operations, such as the smuggling of goods or people and merchandise theft. Port security departments must be prepared to face the different threats and challenges that accompany the use of innovative techniques and devices to achieve efficient inspection strategies. Two inspection strategies are presented in this study. The first strategy is based on fuzzy logic (FL), and the second strategy is based on the growing hierarchical self-organizing map (GHSOM) approach. The weight variation and security index (SI) of a container and the readings from certain technologies, such as radio-frequency identification (RFID) and X-ray scanning, are considered as the input data. To minimize the inspection time and considering the costs associated with the security inspections of containers, the results of both inspection strategies are compared and analyzed. The findings indicate there is potential for improving the effectiveness of security inspections by employing both techniques, and the specific relevance in the case of GHSOMs is discussed.
- © 2020 The Authors. Published by Atlantis Press SARL.
- Open Access
- This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
Container transport plays an important role in global supply chains and contributes increasingly to economic development around the world. However, considerable security vulnerabilities have emerged.
A container terminal is a complicated system with several interrelated components and different interconnected operations, such as security inspections, which should be harmoniously executed to avoid delays in the corresponding inspection times.
Security inspections can add costs, delays and uncertainties during the transport process. The disruptions in the supply chain caused by delays in the inspection area of a container terminal can be disastrous and have cascading consequences. In addition, container transport can be used for illegal operations, such as the smuggling of goods and people, and can be employed by terrorist organizations to transport weapons of mass destruction or biohazards.
Port authorities increasingly demand data on containers, as these data provide information about their content, country of origin and shipping company. The daily analysis of container data can be a difficult process; thus, the following scientific question should be answered: Is it possible to reduce the number of containers assigned to manual inspection in ports and simultaneously improve the systems for the detection of containers that transport illegal material via new technologies without increasing the cost or time spent on inspection?
As a hypothesis based on this question, we propose that the use of artificial intelligence techniques, which have not yet been incorporated into the security inspection systems of container port terminals, can improve the process efficiency and possibly reduce or at least maintain the corresponding time and cost. Thus, the goal of this investigation is to demonstrate the possibility of increasing the detection of illegal containers (containers with illegal material or containers whose merchandise has been stolen) without increasing the cost or time of the inspection processes. The fusion of information and computational algorithms will enable the automatic identification of threats and the presentation of the relevant data to an operator to provide decision support regarding the classification of containers.
Methodologies that are based on artificial intelligence (fuzzy logic (FL) and the growing hierarchical self-organizing map (GHSOM)) and the extensive information associated with containers and their processes, including information from the technological devices that are currently used in their surveillance, are employed to develop tools for decision-making that automate the process and minimize manual inspections without reducing their reliability.
In addition, the use of the container weight as an additional decision variable in the early stages of the container inspection process is a novel proposal that arises from the new regulations that have been promoted by the International Maritime Organization (IMO) since 2016 as part of the new implemented measures for the verification of the gross mass of full containers in the Safety of Life at Sea (SOLAS) convention. These measures have been put in place due to the numerous container ship accidents caused by the excessive weight of containers. Consequently, this information could be appropriately integrated into the inspection strategy in the near future.
In this way, two new inspection strategies are developed based on FL and GHSOM methods. In our case, these strategies employ the container information along with the innovative use of the weight readings (which are currently not incorporated into security procedures) and container security indices (SIs) as well as radio-frequency identification (RFID) and nonintrusive technologies.
There is some scientific literature that deals with RFID technology, the SI of containers and nonintrusive inspection techniques for containers (e.g. see the English and Zuver ). However, the integration of these elements in one approach has not been addressed. In addition, the consideration of the FL and GHSOM approaches is novel in the scientific literature regarding their application to container inspection at ports.
The structure of the paper is as follows: A literature review of the related studies is presented in Section 2. Section 3 presents a general inspection strategy, and Section 4 details the FL and GHSOM methods. The procedures for generating the experimental data are explained in Section 5, and the results of both models are presented in Section 6. In Section 7, the discussion and final conclusions are provided.
2. LITERATURE REVIEW
The process by which security inspections are performed in container terminals is important because this process affects both the maritime supply chain and the associated costs. In this section, a scientific literature review of the inspection processes in container port terminals and the FL and GHSOM approaches is presented.
2.1. Container Inspection in Port Terminals
Among the different operations performed in a container terminal, the security inspection is one of the most important operations. The delays in the inspection area of a container terminal are primarily attributed to the manual inspection of containers. As these manual inspections require several hours per container, the manual inspection of all the containers is not viable in terms of the general efficiency of container terminal operations. Classifying the containers using a certain inspection strategy helps to reduce the number of containers that will be manually inspected, thereby reducing the time of the operations in the container terminal. By investigating different algorithms, methods and approaches, as well as the implementation of FL, we were able to improve the classification of containers and minimize the inspection times and costs.
Bakshi et al. analyzed the impact of two important inspection initiatives: the Container Security Initiative (CSI) and the Secure Freight Initiative (SFI). Boros et al. developed a linear decision tree model to obtain the optimum sequences of inspection strategies. Boros et al. considered a combination of decision trees and inspection systems by enumerating efficient inspection policies. Longo designed operationally effective practices and policies to improve the flow of containers both toward the inspection zone and within the normal operations of a container terminal. Lee et al. presented a genetic algorithm for optimizing the percentage of containers that are examined and the sequence of the container movements, which minimizes time-delay costs. Harris et al. performed simulations to determine the necessary inspection resources for minimizing the interruption caused by an increase in security inspections in a container terminal.
Elsayed et al. presented several optimization approaches to simultaneously determine the optimal sensor threshold levels and the inspection sequence. Young et al. extended the study by Elsayed et al. with a multiobjective optimization approach for determining the optimal management of sensors and their threshold levels, considering the total costs. van Weele and Ramirez-Marquez presented an optimization technique for developing an inspection strategy that establishes an inspection rate of suspect containers that minimizes the inspection costs. Riahi et al. employed a dataset to establish the reliability percentages for the country of origin, the shippers and the container terminals, from which they obtained the SIs of the containers.
Ramirez-Marquez  presented an inspection strategy that introduces different types of reliability and cost measures. An evolutionary optimization approach that is known as a probabilistic solution discovery algorithm is applied to generate an optimal inspection strategy.
Concho and Ramirez-Marquez  developed a holistic evolutionary algorithm for identifying the optimal threshold values for every sensor and the optimal configuration for the inspection strategy. Ma et al.  employed the maximum likelihood (ML) estimation method to identify the efficiency factors for inspection, which improves the quarantine and clearance processes of the containers in a port. Wang et al.  developed a stylized queueing model with novel features related to the security checkpoints to analyze policy initiatives. Wang et al.  discussed an inspection investment planning problem for the international container terminal at the Dalian Port using a simulation method. They proposed a framework that combines an arena-based simulation model that considers various types of container ships and flexible container truck scheduling and routing.
Table 1 presents a summary of the investigations regarding the optimization methods for improving the security of a container terminal.
|Reference||Modeling (Algorithms)||Experimental Data Size||Main Contribution|
|||Simulation models||Two container terminals||Effect of inspections on the flow of containers|
|||||Container inspections in container terminals||Minimize the inspection costs and inspection error rate|
|||Dynamic programming algorithms||Port inspections represented by decision trees||Establish some effective properties for inspection systems, which minimize the cost|
|||Holistic evolutionary algorithm; General decision tree model||Container inspection strategy||Minimize the total cost of inspection while maintaining a user-specified detection rate for “suspicious” containers|
|||Port-of-entry problem||Small number of inspection stations||Optimal sensor threshold levels|
|||||Alabama Container Terminal||Minimize the interruptions from the increased security inspections of containers in a terminal|
|||Genetic algorithm||Operations of a container terminal||Optimize the inspection process and the sequence of the movements of containers in the yard; minimize the total costs|
|||Design of experimental techniques||Container terminal||Integration of the security procedures in the normal operations of the container terminal|
|||Factor conception model; Structural equation model||Inspection and quarantine clearance efficiency in Shanghai, China||Provide a theoretical basis for the analysis of the internal economic effectiveness|
|||(n + 1)-echelon decision tree; General decision tree model||Container inspections in container terminals||Minimize the total cost of inspection while maintaining a user-specified detection rate for “suspicious” containers|
|||Bayesian network (BN); Analytic hierarchy process (AHP)||Case study||Evaluate the security score of a container|
|||||Modeling of security inspections with four types of sensors||Inspection rates for suspicious containers|
|||Queueing model with novel features||Security-check waiting lines for screening cargo containers||Provide a modeling framework to understand the economic trade-offs embedded in the container-inspection decisions|
|||Arena-based simulation model; Visual Basic for Applications||International container terminal at Dalian Port||Address an inspection investment planning problem for the international container terminal at Dalian Port using a simulation method|
|||Analysis of variance (ANOVA)||Two suspect containers per 10,000 containers||Determine the optimal levels of sensor layouts and thresholds|
Approaches for improving the security of a container terminal.
2.2. Fuzzy Logic
FL allows us to deal with imprecise information by considering the data as fuzzy sets. The fuzzy sets are combined through rules that define the actions. Thus, control systems based on FL are able to combine the input variables by applying groups of rules that lead to one or more output values.
Like neural networks, systems based on FL can be applied to nonlinear or partially defined problems. However, in contrast to neural networks, FL allows for the easy implementation of expert knowledge by formalizing the sometimes ambiguous knowledge of experts. In addition, FL allows for the design of inexpensive and quick control and decision systems.
The application of an FL algorithm can be described by the following three steps:
Fuzzification, where the input values are converted to fuzzy values
Inference, which is a process based on the logic rules
Defuzzification, where the fuzzy variables are reconverted, and a decision is made
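The three steps above can be illustrated with a minimal sketch. It is not the authors' model: the triangular membership functions, the 0 to 100 input scale, the rule set and the output labels (`release`, `scan`, `inspect`) are all hypothetical choices made only to show how fuzzification, inference and defuzzification fit together.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets on a 0-100 scale (illustrative break points only).
LOW = (-1.0, 0.0, 50.0)
MEDIUM = (0.0, 50.0, 100.0)
HIGH = (50.0, 100.0, 101.0)

def fuzzify(x):
    """Step 1: convert a crisp value into degrees of membership."""
    return {
        "low": triangular(x, *LOW),
        "medium": triangular(x, *MEDIUM),
        "high": triangular(x, *HIGH),
    }

def infer(risk_a, risk_b):
    """Step 2: fire IF x AND y THEN z rules; AND is min, OR is max."""
    a, b = fuzzify(risk_a), fuzzify(risk_b)
    return {
        "release": min(a["low"], b["low"]),
        "scan": max(min(a["medium"], b["medium"]),
                    min(a["low"], b["high"]),
                    min(a["high"], b["low"])),
        "inspect": min(a["high"], b["high"]),
    }

def defuzzify(strengths):
    """Step 3: weighted average of representative output values."""
    centers = {"release": 0.0, "scan": 50.0, "inspect": 100.0}
    total = sum(strengths.values())
    if total == 0.0:
        return None  # no rule fired for this (deliberately incomplete) rule set
    return sum(strengths[k] * centers[k] for k in strengths) / total

score = defuzzify(infer(25.0, 25.0))
```

With both inputs at 25, the `low` and `medium` sets each have membership 0.5, so the `release` and `scan` rules fire equally and the crisp score lies halfway between their centers.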
FL has been used as a tool for processing large amounts of information, in which the data can have an associated degree of partial set membership. FL methods are the main actors in some investigations of system control; in other studies, FL methods aid in decision-making.
Starczewski designed an efficient fuzzy logic system (FLS) that is based on triangular type-2 fuzzy sets. This FLS provides a new method for reducing computational complexity in t-norm operations that is extended to triangular type-2 fuzzy sets. Motepe et al. presented an FL method and an experimental investigation based on real measurements of the South African power system network. Magudeeswaran and Ravichandran presented an FL-based histogram equalization (FHE) method to enhance image contrast to highlight the details of a hidden image or increase the image contrast with a new dynamic range.
Liang et al.  used fuzzy set theory to construct an optimum output quantity decision model to obtain the maximum profit of a duopoly market. Huerta et al.  presented an FL-based preprocessing approach that consists of two main steps. First, the approach employs fuzzy inference rules to transform the gene expression levels of a given dataset into fuzzy values. Second, the approach applies a similarity relation to the fuzzy values to define the fuzzy equivalence groups. Each group contains similar genes, which assists with the selection of an essential subset of genes for the classification and analysis of microarray data. Hsueh  used the Delphi method and FL theory to develop a quantification assessment model that is based on the qualitative analysis used to evaluate the results and influences of participation in environmental protection education and green community development by residents of the Taiwan community.
Table 2 presents a summary of the investigations regarding the FL method.
|Reference||Modeling (Algorithms)||Experimental Data Size||Main Contribution|
|||Fuzzy logic||Analysis of microarray data||Gene selection|
|||Fuzzy logic||Community residents' participation in environmental protection education||Assess the results and influences of community residents' participation in environmental protection education on green community development|
|||Fuzzy decision environment||Duopoly market||Construct an optimum output quantity decision model that aims to maximize the profit of a duopoly market|
|||FL-based histogram equalization||Images||Unveil the hidden image details or increase the image contrast with a new dynamic range|
|||FL||South African power systems network||Determine a distribution power systems' loading measurement accuracy|
|||FLS based on triangular type-2 fuzzy sets||t-norm operations||Provide a new method for computational complexity reduction in t-norm operations extended to triangular type-2 fuzzy sets|
FL, fuzzy logic; FLS, fuzzy logic system.
Approaches for fuzzy logic method.
2.3. The SOM and GHSOM
Self-organizing maps (SOMs) were developed by Teuvo Kohonen in the 1980s (see Kohonen, 2001 for a good introduction to SOMs) as a continuation of the competitive networks proposed by von der Malsburg. SOM networks have been successfully applied to a large variety of problems, such as pattern classification, dimensionality reduction, process monitoring and data mining, among others.
An SOM obtains the statistical characteristics of the input data, which can then be applied across a wide range of data classification tasks. However, the effectiveness of traditional SOM models is limited by the following issues:
The GHSOM has an adaptive, unsupervised architecture that focuses on clustering data. When the distribution of the data increases in a hierarchical manner, the approach allows for its hierarchical decomposition and exploration of the data clusters in a horizontal manner. The GHSOM has a hierarchical architecture that is divided into layers; each layer is composed of different SOMs, and the size of each SOM is automatically determined during the unsupervised learning process. The main advantage of a GHSOM compared with a traditional SOM is that trial and error is removed from the training process. An ideal topology is formed in an unsupervised manner based on the training data.
Palomo et al.  presented a new approach for analyzing and visualizing network forensics data (network forensics is an area of research that collects information regarding crimes that involve digital evidence) based on GHSOMs. Ippoliti and Zhou  proposed an adaptive GHSOM approach (AGHSOM) for network anomaly detection. Chattopadhyay et al.  proposed a GHSOM that improves the cell formation problem (CFP) of a cellular manufacturing system. Chan and Pampalk  developed a GHSOM Toolbox for MATLAB, which has an advantage in visualization due to its capability of presenting classes and subclasses of similar data. By combining the GHSOM with mutual information, Zhang et al.  proposed a new intrusion detection method for detecting unknown network attacks.
Table 3 provides a summary of investigations on the SOM and GHSOM approaches.
|Reference||Modeling (Algorithms)||Experimental Data Size||Main Contribution|
|||GHSOM Toolbox for MATLAB||Determine the size of the SOM Presenting classes and subclasses of similar data||Development of the GHSOM Toolbox for MATLAB|
|||SOM approach GHSOM||CFP of cellular manufacturing system||Development of optimum machine-part cell formation algorithms|
|||GHSOM||Set of data||Grow in terms of map size and a three-dimensional tree-structure to represent the hierarchical structure in a data collection during an unsupervised training process|
|||Growing hierarchical tree SOM (GHTSOM)||Set of Internet meaning data||Allow the network to adapt the topology of each layer of the hierarchy to the characteristics of the training set|
|||AGHSOM||Set of online data||Network anomaly detection|
|||GHSOM||Network forensics||Improve the visualization of network traffic data|
|||GHSOM method||Set of data||Clustering of input data|
SOM, self-organizing maps; GHSOM, growing hierarchical self-organizing map; CFP, cell formation problem.
Approaches for SOM and GHSOM.
In Section 2.1, the different investigations within the optimization field for improving the security inspections in port terminals were analyzed. Although different models, simulations and optimization designs have been employed to achieve an optimal inspection strategy, the inspection strategies based on FL or SOM and GHSOM approaches have not been designed considering the container weights as input data, which is our research contribution. Both strategies are compared and analyzed to obtain the most efficient inspection strategy when classifying the containers as follows: suspicious containers will be manually inspected; probably suspicious containers will be inspected by X-ray scanning; and not suspicious containers will be released to continue their path through the container terminal.
3. THE INSPECTION STRATEGY
The following values were employed for this investigation: the weight variation among the containers; the values provided by the RFID technology that indicate if the container could have been opened; the SI values from Riahi et al., which are shown in Table 4; and the values obtained from the simulation of X-ray scanning at the control points. These variables are combined to design an efficient algorithm that overcomes the limitations of considering each variable independently and provides a very satisfactory classification of containers (suspicious and nonsuspicious).
Security percentages of the countries of origin and carriers/ports.
Safety management regulation is an important complement to market forces to establish a sufficient safety level in high-risk industries, which explains why the IMO implemented measures for the verification of the gross mass of full containers in the SOLAS convention due to the numerous container ship accidents caused by the excessive weight of containers. The new regulation, which has been in effect since July 1, 2016, seeks to avoid accidents caused by an improper weight distribution by requiring the verification of the container weights. This information is reflected in the documentation. These regulations enable the use of the container weight as input data for our investigation; this variable has not been included in inspection strategies for decision-making.
The RFID technologies are very reliable but cannot guarantee 100% security, which indicates that an inspection strategy based on RFID technologies would not provide optimal results when classifying containers as suspicious or not suspicious. In addition to these technologies, the container weights and other technologies and indicators will be considered in this study.
The RFID technologies have the following limitations:
The collisions that occur when trying to simultaneously read several tags cause data loss.
The RFID tags can be damaged during container transport.
The weather conditions can affect the RFID tag and cause the transmission of inaccurate readings regarding the opened or closed condition of the container.
The containers that are considered to be suspicious can be subjected to X-ray scanning, as in a previously proposed image discrimination system in which the theory of using two X-ray energies was developed to analyze objects and extract their atomic information. This system enables the load of a container to be classified according to the image attenuation range and therefore provides a final classification of the containers: suspicious containers will require a manual inspection, while not suspicious containers will be cleared from the container terminal.
Dual-energy imaging is a technique that scans objects at two X-ray energy levels. In our case, the attenuation coefficients at two energies in the megaelectron-volt (MeV) range are employed, using the values provided by the National Institute of Standards and Technology (NIST, U.S.).
3.1. The General Inspection Strategy
Inside a container terminal, all the containers undergo an initial inspection, the results of which are used to classify them as “suspicious” or “not suspicious.” Based on this classification, a container will be subjected to additional controls that will enable its entrance or clearance or determine whether it has to be manually inspected.
The use of decision trees for inspection strategies was first suggested by Boros et al. Later, Van Weele and Ramirez-Marquez represented the process more generally as a decision tree in which the results of each inspection determine the path of the container through the tree.
The decision tree models presented in this document consider different factors in the inspection process to improve the strategy, such as the weight and security score of a container and the RFID readings. The decision trees for each optimization method are shown in Figures 1 and 2.
Both trees are similar for the inspection and classification of a container. The largest difference between the two strategies is that, in the FL strategy (Figure 1), the RFID reading from the container is analyzed separately in node 1 to indicate if the container has been opened during transit. If the container was opened, it is classified as a suspicious container and will directly undergo manual inspection; otherwise, it will be classified as probably suspicious and will pass to node 2, where the classification of the container is obtained by applying FL to the container input data, namely the weight variation and the SI values. In this manner, the containers will be classified as suspicious, not suspicious or probably suspicious; the probably suspicious containers will pass to node 3, where the weight variation, SI data and X-ray results (Se) will be reanalyzed by FL.
This separation of the variables in the decision tree is attributed to the fuzzy nature of the three measures (weight variation, SI and X-ray results), which contrasts with the binary nature of the RFID variable.
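The routing through the FL decision tree described above can be sketched as follows. This is only an illustration of the control flow: the function name `route_container` and the two classifier callables are hypothetical stand-ins for the fuzzy classifiers of nodes 2 and 3, and the labels follow the paper's P (not suspicious), Sc (probably suspicious) and M (suspicious) convention.

```python
from typing import Callable

def route_container(rfid_opened: bool,
                    classify_node2: Callable[[], str],
                    classify_node3: Callable[[], str]) -> str:
    """Return 'M' (manual inspection) or 'P' (release) for one container."""
    # Node 1: crisp (binary) RFID check; an opened container goes
    # straight to manual inspection.
    if rfid_opened:
        return "M"
    # Node 2: fuzzy classification on weight variation and SI
    # (hypothetical callable returning 'P', 'Sc' or 'M').
    label = classify_node2()
    if label in ("M", "P"):
        return label
    # Node 3: 'Sc' containers are X-ray scanned and reclassified using
    # weight variation, SI and the X-ray result (returns 'P' or 'M').
    return classify_node3()
```

Passing the classifiers as callables means the node 3 step, which stands for the X-ray scan, is only evaluated for containers that node 2 leaves undecided.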
To generate the first GHSOM or SOM 1 level, the input data, which consists of the weight variation, SI values and the RFID readings, as shown in Figure 2, are simultaneously analyzed in the first step. Thus, first, the containers are classified into three sets: M, for the containers that are suspicious that will directly proceed to manual inspection; P, for containers that are not suspicious, which will leave the inspection area; and S, for the containers that will be subjected to X-ray scanning, since there is not sufficient information to determine if they are suspicious. The SOM is employed for data clustering and visualization, which enables the classification of the variables regardless of whether they are fuzzy or not.
4. DECISION SUPPORT METHODOLOGIES FOR SECURITY INSPECTION BASED ON ARTIFICIAL INTELLIGENCE
In this section, we detail the proposed methodology for security inspection strategies based on the FL and GHSOM approaches.
4.1. The FL Model
The proposed FL-based inspection strategy is explained next.
As indicated, FL is an artificial intelligence technique that enables working with information that is inaccurate and poorly defined. FL is used as a calculation tool for truth criteria; it is based on a scale of values from the falsest value to the truest value and provides a quantitative result that ensures the selection of the alternative that is closest to the truth, considering the attributes that it satisfies as a basis to attain a certain goal. In this manner, FL is a robust method that does not require a substantial amount of information.
We will use FL to sharpen the inspection strategy results. The input data, namely the weight variation and security score (SI), are considered in nodes 2 and 3; in node 3, a new input is added for the X-ray (Se) information to obtain a final classification that minimizes the inspection times and the manual inspections.
4.1.1. The data and variables of the FL model
In the model of Figure 1, node 1 is a classic logic decision. If a container has been opened, it will be manually inspected. If a container has not been opened, it will proceed to node 2, where it will be classified again by the fuzzy algorithm. The variables to be analyzed by the algorithm in nodes 2 and 3 of the decision tree are defined in the following section.
The structure followed during the decision processes starts with the statement of the input and output variables. Then, the membership functions for each input are explained, and the fuzzy linguistic variables are detailed by means of the structure rules (IF x AND y THEN z) creating the rule matrix. Finally, to solve the problem we make use of the Root-Sum-Square (RSS) method.
The calculations for all the fuzzy variables were performed using the MATLAB Fuzzy Logic Toolbox. This toolbox allows fuzzy inference systems to be created and edited with graphical tools or command-line functions; they can also be generated with the adaptive clustering techniques in the toolbox.
4.1.2. Decision process in node 2
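Two fuzzy input variables are defined, the weight variation and the security score of the container, together with three possible outputs:
Weight variation: the change in the container weight between the origin and the destination
SI: the container security index
P: If the container is not suspicious
Sc: If the container is probably suspicious
M: If the container is suspicious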
Input 1: the weight variation
The weight variation input variable is constructed for each container from two weight values: the weight at the origin (the first measurement of the container weight) and the weight at the destination port (the last measurement of the container weight). This variable evaluates whether the weight of the container has changed during the trip from the origin to the destination. A reduction of the original container weight beyond a tolerance threshold can signal the theft of goods, whereas an increase beyond the threshold can signal the suspicious (and possibly illegal) introduction of goods into the container during the trip. Consequently, three cases are distinguished: a decrease beyond the threshold, a variation within the threshold and an increase beyond the threshold.
A variation between the values above the threshold suggests that the container should be considered as suspicious. This could be due to an increase in the weight (that could be associated with adding some illegal goods for smuggling) or a decrease in the weight (that could be associated with the theft of goods from the container).
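The weight comparison described above can be sketched as a crisp check. This is a simplification under stated assumptions: the function name, argument names and the tolerance value in the usage line are hypothetical, and the paper's model feeds the variation into fuzzy membership functions rather than hard thresholds.

```python
def weight_case(weight_origin: float, weight_destination: float,
                tolerance: float) -> str:
    """Classify the weight change of one container against a tolerance.

    All three arguments are assumed to share the same unit (e.g. kg).
    """
    variation = weight_destination - weight_origin
    if variation < -tolerance:
        # Weight dropped beyond the tolerance: possible theft of goods.
        return "decrease"
    if variation > tolerance:
        # Weight rose beyond the tolerance: possible introduction of goods.
        return "increase"
    # Change is within the tolerance: no weight-based suspicion.
    return "within tolerance"

# Hypothetical usage: a 500 kg loss against a 100 kg tolerance.
case = weight_case(20000.0, 19500.0, 100.0)
```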
Input 2: container SI
The container SI will be
Table 5 presents the matrix of rules that determines the membership.
Rule matrix for Node 2.
Structure rules and the rule matrix
The RSS method is applied to solve the system. This approach combines the effects of all the rules, scales the functions according to their dimensions and calculates the fuzzy centroid of the composed area.
Using the RSS approach, the values of the probably suspicious containers (Sc) can be obtained using Equation (7); the values of the not suspicious containers (P) can be obtained using Equation (8); and the values of the suspicious containers (M) can be obtained using Equation (10). These equations are incorporated into the defuzzification function for the final decision (Figure 3c).
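As a rough illustration of the RSS step described above, the sketch below combines the firing strengths of all rules that share an output label by taking the square root of the sum of their squares, and then computes a fuzzy centroid as a strength-weighted average of output centers. The firing strengths and the center values (0, 50, 100) are hypothetical; the paper's own values come from its rule matrix and membership functions, which are not reproduced here.

```python
import math

def rss_strengths(fired_rules):
    """Combine rule firing strengths per output label by root-sum-square.

    fired_rules is a list of (output_label, firing_strength) pairs.
    """
    sums = {}
    for label, strength in fired_rules:
        sums[label] = sums.get(label, 0.0) + strength ** 2
    return {label: math.sqrt(total) for label, total in sums.items()}

def fuzzy_centroid(strengths, centers):
    """Strength-weighted average of the output membership centers."""
    total = sum(strengths.values())
    if total == 0.0:
        raise ValueError("no rule fired")
    return sum(centers[label] * s for label, s in strengths.items()) / total

# Hypothetical firing strengths for rules leading to P, Sc and M.
fired = [("P", 0.3), ("P", 0.4), ("Sc", 0.0), ("M", 0.5)]
centers = {"P": 0.0, "Sc": 50.0, "M": 100.0}  # assumed output centers

strengths = rss_strengths(fired)
crisp = fuzzy_centroid(strengths, centers)
```

In this example the two P rules combine to a strength of 0.5, the same as the single M rule, so the crisp output falls midway between the P and M centers.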
4.1.3. Decision process in node 3
After passing through node 2, all the containers that are classified as probably suspicious will be inspected by a nonintrusive X-ray scan in node 3, where they will be classified again using the FL model, considering their weight variation, security score and X-ray result. The containers will be classified as suspicious or not suspicious.
For node 3, input 1 and input 2 (the weight variation and SI of a container, respectively) will be the same as in node 2. The container will be scanned by X-ray and a new input is defined for the X-ray analysis data and the output of node 3.
: X-ray scanning result
The output will be as follows:
P: For the containers that pass the inspections
M: For the containers that will undergo a manual inspection
Input 3: X-ray scanning result
The X-ray scanning result will be
Table 6 presents the matrix of rules that determine the membership.
Structure of rules for node 3.
4.1.4. Structure rules and the rule matrix
The RSS method is applied to solve the system and obtain the values of the suspicious containers (M) in Equation (13) and the not suspicious containers (P) in Equation (14), which are incorporated into the defuzzification function for the final decision (Figure 5):
4.2. The GHSOM Model
As previously explained, GHSOMs are networks formed by several SOM networks whose size is automatically determined during the unsupervised learning process. In this section, their operation and implementation are described, beginning with the SOM network training process.
4.2.1. SOM network training
An SOM is an unsupervised neural network model that can be used for data clustering and visualization applications. An SOM can project high-dimensional patterns onto a low-dimensional topology map. The SOM maps consist of a one-dimensional (1D) or two-dimensional (2D) node grid. These nodes are also referred to as neurons. The weight vector of each neuron has the same dimension as the input vector.
These neural networks classify the input data without supervision, and their architecture consists of two layers: the first layer, which is also called the competition layer, consists of the learning nodes, which contain information about the resulting representation, and the second layer consists of the input nodes, which represent the original vectors during the training process. All the elements of the first layer are connected to all the elements of the second layer.
Figure 6 shows the basic structure of an SOM network, where represents the weights assigned to each node of the competition layer, and represents the input nodes.
The classic SOM network learning algorithm can be formulated as follows (for an in-depth analysis of the algorithm, refer to ):
1) The synaptic weights are initialized as small absolute values or, in our case, default values.
2) An input vector, X, is randomly chosen.
3) The winner neuron is determined by calculating the Euclidean distance between the previously chosen input vector, X, and the synaptic weight vector of each neuron.
4) The synaptic weights of the winner neuron and its neighboring neurons are updated according to the weight update rule.
The learning rate determines the neuron weight variation; it is a time-decreasing function that is updated linearly, and its values fall between 0 and 1.
The neighborhood function is used to determine which neurons are neighbors of the winner neuron at each iteration.
The neighborhood function decreases with time and depends on a parameter called the neighborhood radius, which represents the size of the current neighborhood.
The simplest representation for the neighborhood function is step-like:
A neuron is in the neighborhood of the winner neuron if its Euclidean distance to the winner is smaller than the neighborhood radius.
5) The algorithm is repeated from step 2) until the required number of iterations is reached or the stopping condition is met.
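The training loop described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the toolbox implementation used in the study: the function names, the initial weight range, the initial learning rate of 0.5, the linear radius decay and the use of grid distance for the step-like neighborhood are all choices made here for the example.

```python
import math
import random


def train_som(data, rows, cols, n_iter, lr0=0.5, radius0=None, seed=0):
    """Train a rows x cols SOM on `data` (a list of equal-length float lists)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize the synaptic weights with small values (assumed range).
    weights = {(r, c): [rng.uniform(-0.05, 0.05) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    if radius0 is None:
        radius0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        decay = 1.0 - t / n_iter
        lr = lr0 * decay                    # learning rate: linear decay, stays in (0, 1)
        radius = max(radius0 * decay, 1.0)  # neighborhood radius shrinks over time
        x = rng.choice(data)                # randomly chosen input vector X
        # Winner: the neuron whose weight vector is closest to X.
        winner = min(weights, key=lambda node: math.dist(weights[node], x))
        # Step-like neighborhood: update every neuron within `radius` of the winner.
        for node, w in weights.items():
            if math.dist(node, winner) < radius:
                for k in range(dim):
                    w[k] += lr * (x[k] - w[k])
    return weights


def best_matching_unit(weights, x):
    """Return the grid position of the neuron closest to input x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


# Hypothetical two-feature inputs scaled to [0, 1].
samples = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(samples, rows=3, cols=3, n_iter=200)
unit = best_matching_unit(som, [1.0, 1.0])
```

Once trained, each container is assigned to its best matching unit, and the label attached to that neuron gives the classification.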
4.2.2. GHSOM network training
In this section, we explain the procedures for training the SOM 1 and 2 networks, in which neuron identification was performed to obtain the classification results of the container. Figure 7 shows the algorithm designed to determine the size of the networks.
After the network has been trained and its neurons have been identified, the network is tested and evaluated, and the container data are introduced.
We used the MATLAB GHSOM Toolbox to train our network; this toolbox extends the functionality of the SOM Toolbox. Building the GHSOM networks on the basic functions of the SOM Toolbox provides a more robust and standardized network. Once the network is trained, the decision algorithm identifies the hexagons (neurons), which are then used to evaluate the network results. If the classification error rate of the containers is less than 0.5% (that is, fewer than 50 of the 10,000 containers have been misclassified), then the network is considered to be satisfactory; otherwise, the network size is increased and the network is retrained.
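The size-selection loop can be sketched as follows. Only the 0.5% acceptance threshold comes from the text; the starting size, the growth step, the upper bound and the `train_and_score` callback are hypothetical, since the flow chart itself is not reproduced here.

```python
def find_network_size(train_and_score, start=10, step=10, max_size=50, max_error=0.005):
    """Grow a square SOM until its classification error rate drops below max_error.

    train_and_score(size) is a caller-supplied (hypothetical) function that trains
    a size x size network and returns its error rate as a fraction.
    """
    size = start
    while size <= max_size:
        error = train_and_score(size)
        if error < max_error:
            return size, error
        size += step  # network too small: enlarge it and retrain
    raise RuntimeError("no network size up to max_size met the error threshold")


# Stub standing in for real training: pretends the error falls once the map reaches 20 x 20.
def fake_train_and_score(size):
    return 0.02 if size < 20 else 0.001


chosen_size, chosen_error = find_network_size(fake_train_and_score)
```

With the stub above, the loop rejects the 10 × 10 map and accepts the 20 × 20 map.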
For the SOM 1 network, the algorithm determined an optimal size of 10 × 10. The algorithm determined an optimal size of 20 × 20 for the SOM 2 network.
This phase is the first step of the inspection strategy where the network classifies the input data, which consist of the weight variation, ∆W, the SI and the RFID reading. The result of the decision algorithm that determines the network size is shown in Figure 7, and the trained SOM 1 network is depicted in Figure 8. This figure shows the final configuration of the trained network that satisfies the error parameter conditions.
In Figure 8, the blue hexagons represent the neurons, the red lines connect the neighboring neurons, and the other colors represent the distances between the neurons; the darker colors represent longer distances, and the lighter colors represent shorter distances.
After properly training the neurons to achieve a container classification error rate of less than 0.5% (fewer than 50 misclassified containers), the containers that are assigned to each neuron of the SOM 1 network are identified; that is, the neurons that will classify the containers as suspicious, probably suspicious or not suspicious, depending on their information, are identified.
Figures 3a and 3b show the membership functions for the ∆W and SI variables, respectively. For the hexagon classification, we use the same previously defined membership functions, since they provide the boundaries for each of these variables due to their fuzzy nature and because this ensures that both inspection strategies employ the same information. These boundaries define the risky, safe and no-information or zero zones of each variable. For the RFID variable, a membership function is not needed, since it is a binary variable that has a value of 0 if the container was opened and 1 otherwise. Using this information, the type of containers that are assigned to each neuron can be identified.
The neuron classification in the SOM 1 network, the training of which was depicted in Figure 8, is visualized in Figure 9. The containers associated with neurons classified as M will be manually inspected; the containers associated with neurons classified as P will leave the inspection zone since they are considered not suspicious; and the containers associated with neurons classified as Sc will proceed to the following verification level of the GHSOM (SOM 2), where they will undergo an X-ray scan.
In the following level of the GHSOM, consisting of SOM 2, the ∆W, SI and RFID input variables for the containers that had been classified as probably suspicious (Sc) are analyzed again, together with the X-ray variable, which provides information about the X-ray inspection of the container.
With the variables ∆W, SI and RFID alone, the state of all containers, suspicious or not suspicious, cannot be defined. As a result, a new network (SOM 2) is trained. The input data are the output data of the SOM 1 network, namely, the containers classified as probably suspicious. The SOM 2 network employs the ∆W, SI and RFID variables as well as the X-ray scanning result obtained from the containers.
Figure 10 shows the results of the decision algorithm that defines the size of the network (see the flow chart in Figure 7) for SOM 2, which defines a 20 × 20 network size. Again, this configuration is the final configuration that meets the error requirements stated in the algorithm to define the network size.
In Figure 10, the blue hexagons represent the neurons, the red lines connect the neighboring neurons, and the other colors represent the distances between the neurons; the darker colors represent longer distances, and the lighter colors represent shorter distances. Unlike in SOM 1, the distances between the neurons are very small due to the size of the network and the variable values.
The type of neuron that will classify the containers as suspicious or not suspicious, depending on the variables ∆W, SI, RFID and the X-ray result, is identified.
As previously explained, to identify which type of containers are assigned to each neuron, the membership functions were employed for the ∆W and SI variables. For the new SOM 2 network, the membership functions for ∆W, SI and the X-ray result are shown in Figures 3a, 3b and 4c, respectively.
Figure 11 shows the neuron positions for the container classification obtained by the algorithm, given the network size that was shown in Figure 10. The containers located in neurons classified as M will be manually inspected, and the containers located in neurons classified as P will leave the inspection zone.
5. ANALYSIS OF THE RESULTS
The results of each inspection strategy are analyzed in this section. Each strategy uses the same data for the 10,000 containers as input (see Annex 1 for the details related to the data generation for the experimentation). The efficiency of each strategy based on artificial intelligence is observed for the classification of containers. The ability to minimize the cost and times of the inspection zone in a container terminal and the ability to minimize the number of illegal containers that are not detected in the inspection zone are observed, as we presented in the initial scientific equation. The novel introduction of the weight variation variable, ∆W, is very useful and discriminative for the classification of the container input data since both methods can be employed to make decisions for the classification of each container.
5.1. The Base Case
The same data for 10,000 containers were used for each inspection strategy. The results of the inspection strategies, the strategy based on FL and the strategy based on GHSOM networks, are presented in the following section.
5.1.1. The FL approach
In node 1, the RFID tags of the 10,000 containers are analyzed, of which 732 containers were found to have been illegally opened and were classified as suspicious and manually inspected. Of the 732 suspicious containers, 263 containers contained some smuggled merchandise, and 469 containers had part of their merchandise stolen.
A total of 9,268 containers were classified as likely suspicious and were used as the input data of node 2. Using the algorithm, an analysis of the ∆W and SI variables of each of the 9,268 containers was performed.
The approach classified 478 containers as suspicious; these containers contained some type of illegal merchandise. Simultaneously, 5,792 containers were classified as not suspicious and continued their path through the terminal. However, two of these containers carried smuggled merchandise. The remaining 2,998 containers continued to be analyzed in node 3.
In node 3, 2,998 containers were analyzed and passed through an X-ray inspection. With the ∆W and data of each container, 4 containers were classified as suspicious and had to be manually inspected; they contained illegal merchandise. A total of 2,994 containers were classified as not suspicious and continued their path through the container terminal. From these containers, 9 containers carried some type of illegal merchandise that could not be detected by the inspection strategy. To calculate the error rate, we employ the following equation:
Thus, the error rate is 0.11%. A summary of the inspection strategy results is shown in Figure 12. This output error rate, for both nodes 2 and 3, is attributed to the fact that the values used to classify the containers were very small and were almost undetectable by the X-ray scan, the weight variation or the SI. This finding is observed in Figures 13 and 14, where the weight variation is given in tons.
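The figures reported for the three nodes can be cross-checked with a few lines of arithmetic. The sketch assumes that the error rate is the number of undetected illegal containers divided by the total number of containers, which reproduces the stated 0.11%; the variable names are illustrative.

```python
total = 10_000
node1_manual = 732                      # forced-open containers sent to manual inspection
node2_input = total - node1_manual      # containers passed on to node 2

node2 = {"M": 478, "P": 5_792, "Sc": 2_998}  # suspicious, not suspicious, probably suspicious
node3 = {"M": 4, "P": 2_994}                 # outcome of the X-ray stage

# Each stage must account for every container it receives.
assert sum(node2.values()) == node2_input
assert sum(node3.values()) == node2["Sc"]

missed_illegal = 2 + 9                  # undetected at node 2 and at node 3
error_rate = missed_illegal / total
manual_total = node1_manual + node2["M"] + node3["M"]

print(f"{error_rate:.2%}")              # prints 0.11%
print(manual_total)                     # prints 1214
```

The total of 1,214 manual inspections matches the M column of the confusion matrix for the FL approach.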
Specifically, the classification error rate of the inspection strategy was attributed to the very small values of different variables. As shown in Figure 13, two containers had high SI values and were illegal but were classified as not suspicious when a small weight variation existed between the two containers. As shown in Figure 14, these 9 containers were classified as not suspicious when they were illegal because their weight variation was very small and their SI values were very high or the values of the X-ray simulation were low; that is, if two of the container variables had values similar to those that are considered as safe, the system considered these containers not suspicious.
The inspection strategy based on FL consists of three steps: (i) the first node detects containers that were forced open as determined by the RFID data, (ii) the second node makes use of the ∆W and SI variables to further analyze those containers whose RFID reading showed no forced opening, to determine whether the smuggling or theft of goods was possibly carried out at the origin or the electronic seal was replaced and falsified (at node 2, only those containers with all the variables at the maximum level of security are classified as safe), and finally, (iii) at the third node, a nonintrusive technology (X-ray) is used to classify the last risky containers.
5.1.2. The GHSOM approach
In SOM 1, the ∆W, SI and RFID variables of the same 10,000 containers of the previous case were analyzed, of which 1,216 containers were classified as suspicious, 3,837 containers were classified as not suspicious, and 4,947 containers continued to be analyzed in SOM 2. Of the 1,216 suspicious containers, 747 containers carried illegal merchandise and 469 containers had removed or stolen merchandise. Of the 3,837 containers that were classified as not suspicious, none contained any illegal merchandise.
In SOM 2, 4,947 containers were analyzed and subjected to an X-ray scan. With the ∆W, and data of each container, one container was classified as suspicious and had to be manually inspected; it contained illegal merchandise. A total of 4,946 containers were classified as not suspicious and continued their path through the container terminal. Of these containers, 8 containers contained some type of illegal merchandise that could not be detected by the inspection strategy. These 8 containers represent the error rate of the inspection strategy, which is 0.08%.
A summary of the inspection strategy results is shown in Figure 15. This error rate in the SOM 2 output is attributed to the fact that the illegal merchandise in the container was not easily detected by the X-ray scanning model, which hindered the analysis of the weight variation and value.
Figure 15 shows the complete inspection strategy and the results. Note that there are no errors in the classification obtained by SOM 1 for the suspicious containers and the not suspicious containers. The classification obtained by SOM 2 has an error rate of 0.08%, as the amount of illegal merchandise was not detectable by the X-ray scans in this case.
Figure 16 shows the SOM 1 output; the data are grouped into two data clouds defined by the RFID variable. The network is capable of detecting and correctly classifying all the containers with RFID = 0, which are containers that were illegally opened. All the containers with a significant ∆W are classified as suspicious, and the containers with a small ∆W value and an SI value near zero (refer to Figure 3b) are classified as probably suspicious. The containers whose parameters are in the safe zone are classified as not suspicious.
The SOM 2 output is given in Figure 17. A second analysis of the parameters was necessary to detect another suspicious container from the 8 remaining containers in this inspection strategy point. The combination of the ∆W and variables was necessary to detect this container, since all the variables were within the zero threshold; that is, they do not provide sufficient information to identify the type of the container.
6. SUMMARY OF THE RESULTS
In this final summary, we follow a specific table design that allows us to easily visualize the performance of each approach. Each row represents the instances of the predetermined class, and each column represents the instances of the predicted class (or vice versa). Given a classifier and an instance, there are four possible results, as Table 7 shows.
If the instance is positive and it is classified as positive, then it is a true positive (a). However, if it is classified as negative, it is a false negative (b). If the instance is negative and it is classified as negative, then it is a true negative (d). However, if it is classified as positive, it is a false positive (c). Given a classifier and a set of instances, a confusion matrix can be easily constructed.
The rate of true positives and negatives, as well as false positives and negatives, can be calculated using the following metrics (21–24).
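The true positive rate of the classification is given by
$$\mathrm{TPR} = \frac{a}{a + b} \tag{21}$$
The false positive rate of the classification is given by
$$\mathrm{FPR} = \frac{c}{c + d} \tag{22}$$
The true negative rate of the classification is given by
$$\mathrm{TNR} = \frac{d}{c + d} \tag{23}$$
The false negative rate of the classification is given by
$$\mathrm{FNR} = \frac{b}{a + b} \tag{24}$$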
| | P (Passed the Verification) | M (Manual) |
|---|---|---|
| L (legal) | 8,775 (100%) | 0 (0%) |
| I (illegal) | 11 (0.89%) | 1,214 (99.1%) |
Confusion matrix for the fuzzy logic approach.
| | P (Passed the Verification) | M (Manual) |
|---|---|---|
| L (legal) | 8,775 (100%) | 0 (0%) |
| I (illegal) | 8 (0.65%) | 1,217 (99.34%) |
GHSOM, growing hierarchical self-organizing map.
Confusion matrix for the GHSOM approach.
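The rates for both confusion matrices can be recomputed directly from the table counts. The sketch below assumes that the positive class is "illegal", so that an illegal container that passes the verification counts as a false negative; the function and variable names are illustrative.

```python
def rates(tp, fn, fp, tn):
    """Return the four classification rates from confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),  # illegal containers sent to manual inspection
        "FNR": fn / (tp + fn),  # illegal containers that passed the verification
        "FPR": fp / (fp + tn),  # legal containers sent to manual inspection
        "TNR": tn / (fp + tn),  # legal containers that passed the verification
    }


fl = rates(tp=1214, fn=11, fp=0, tn=8775)
ghsom = rates(tp=1217, fn=8, fp=0, tn=8775)

for name, r in (("FL", fl), ("GHSOM", ghsom)):
    print(name, {k: f"{v:.2%}" for k, v in r.items()})
```

Both approaches have a false positive rate of zero, so the comparison rests entirely on the false negative rate: 11 of 1,225 illegal containers for FL against 8 of 1,225 for the GHSOM.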
The FL algorithm shows a good capability to appropriately classify the different containers of our case study. All the legal containers were correctly classified (100% of the cases) and did not require any manual inspection, with its corresponding cost and time. In the case of the illegal containers, the algorithm showed a very low rate of confusion. Only 11 containers, representing 0.89% of the illegal instances, were false negatives and passed the verification without a manual inspection. Of the illegal containers, 99.1% were appropriately identified for manual inspection. Therefore, the algorithm presents a low failure rate.
Regarding the GHSOM approach, the algorithm shows a very high degree of appropriate classification. First, all the legal containers were correctly classified in 100% of the cases and did not require manual inspection with its corresponding cost and time. Regarding the false negative containers, only 8 containers (0.65%) were inadequately classified and were not subjected to manual inspection. On the other hand, 1,217 illegal containers were appropriately subjected to manual inspection.
The comparison of the approaches shows that both of the algorithms are very good classifiers that perfectly classify the legal containers. Both approaches show a very good level of classification for illegal containers with a very low error rate. In this line, the GHSOM approach showed a slightly better performance.
This study has demonstrated how efficient security inspections can be achieved by increasing the security of container transport and minimizing the time and cost spent by applying two artificial intelligence methodologies, which are based on FL and the GHSOM. The container input data, such as the RFID readings, X-ray scanning results and container security data, were analyzed. A novel contribution of the new IMO regulations was the inclusion of the container weight variation to achieve a better adjusted classification of the containers and reduce the number of suspicious containers that are not detected in the inspection area.
Additionally, the weight sensors in the container terminal work with threshold values between 20 and 40 tons. For the sensors recommended in the OIML R 60 regulation (from the International Organization of Legal Metrology), an accuracy of approximately 250 kg is suggested for the sensor working range. It is clear that the weight variation offers significant help in the inspection strategy, but it cannot be used by itself to identify low levels of smuggling. Thus, a combined inspection strategy is proposed, which includes the RFID data, SI and results from a nonintrusive inspection together with the weight variation in an integrated way.
Unlike the data provided by the RFID readings (a binary output variable), the remaining variables were fuzzy (∆W, SI and X-ray variables). Based on the proposed methodologies, inspection strategies can be employed to rapidly classify the containers with a high reliability percentage.
In both algorithms, the use of the weight variation avoids the need to inspect every container and maintains a low error rate while reducing the inspection time in the system. For the FL algorithm, 7,002 containers do not pass through the X-ray inspection, which saves approximately 350 hours of inspection. Using the GHSOM algorithm, 5,053 containers do not pass through the X-ray inspection, which saves approximately 253 hours of inspection. The hours of inspection are calculated considering an inspection rate of approximately 20 containers/hour given by (Boukachour et al., 2011).
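The saved inspection hours follow from the number of containers that skip the X-ray stage and the scanning rate of 20 containers per hour. A minimal check, with illustrative names:

```python
SCAN_RATE = 20  # containers per hour, the rate quoted in the text


def xray_hours_saved(total, scanned, rate=SCAN_RATE):
    """Hours of X-ray scanning avoided for the containers that skip the scanner."""
    return (total - scanned) / rate


fl_hours = xray_hours_saved(10_000, 2_998)     # 7,002 containers skip the X-ray stage
ghsom_hours = xray_hours_saved(10_000, 4_947)  # 5,053 containers skip the X-ray stage

print(f"{fl_hours:.2f} h, {ghsom_hours:.2f} h")
```

The FL strategy sends fewer containers to the scanner, so it saves more scanning time, while the GHSOM strategy scans more containers and misses fewer illegal ones.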
To compare the capabilities of each algorithm, the same data are employed as inputs and adjustment information in both algorithms. Thus, the information received a priori does not affect the results.
First, the FL algorithm achieves very competent global results, with an error rate of only 0.89% for illegal containers. This error rate is low, and only a small amount of smuggling cannot be detected in this strategy (size smaller than 0.00375 m³), which was the margin established in the X-ray simulation as detectable.
Second, the GHSOM neural network algorithm offers even more promising results, with an error rate of only 0.65% for illegal containers. This capability is attributed to the large classification capacity of these types of algorithms, which indicates that this approach is the better option for minimizing the time and costs in the inspection area of a container terminal and decreasing the error rate.
We conclude that the GHSOM and fuzzy algorithms are very similar to each other in terms of their ability to detect and group the study objects into many different categories. Both of the strategies demonstrated very strong capability for the correct classification of containers, and they achieved similar results in terms of the classification accuracy. The false negative rate was slightly better in the case of the GHSOM, but the difference between the two approaches was very low. However, we recommend the adoption of the GHSOM approach, especially when dealing with complex problems, due to its better ability to classify the data in very different groups.
The improvement in the classification capacity of the GHSOM-based algorithm over that of the FL-based algorithm is due to its intrinsic nature. The fuzzy algorithm uses four variables: ∆W, Se and SI, where each variable is divided into three zones, and the RFID variable, which is divided into two zones. In node 3, the algorithm is capable of classifying a container into 27 different groups (three variables divided into three zones). Using the same variables, the GHSOM-based algorithm classifies the containers into the same zones but does not have this limit. In this particular case, the algorithm classifies the containers into 400 different groups (SOM 2 has a size of 20 × 20).
To appropriately analyze a comparison between both approaches, wider alternative experimentation sets should be constructed. This is now one of our future lines of research: the definition of a wide set of experimentation data that closely represents a real situation and considers possible combinations of actions (and also combinations of illegal actions). In this line, a deeper analysis of the intrinsic vulnerabilities of the ∆W and variables should be considered, with particular attention to the variable.
Finally, a detailed study of the cost and time savings at the container terminal attributable to the proposed strategy versus a general (or random) manual inspection strategy would help to identify the advantages of the proposed approaches. Such a study should be conducted using a discrete event simulation approach and should include the saved inspection time, its associated cost savings estimate and an estimate of the consequences of incorrect classifications. This is also a challenging future research direction.
Annex 1. Data generation for experimentation
This annex describes the procedure followed to generate the data for the 10,000 containers that are used for the experimentation.
The data were generated by using the MATLAB “random” function. This function generates random numbers according to a probability density function (PDF). We used a normal distribution, with the PDF stated in Equation (25):
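For reference, the density of a normal distribution has the standard form below. The symbols μ (mean) and σ (standard deviation) are assumed notation, and the parameter values used in the experiments are not restated here.

```latex
f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
```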
The reliability percentages for the country of origin of the containers () and the reliability percentages of the carriers and port () were randomly generated using the “random” function and Equation (25). The generated numbers were then transformed to obtain positive values in an increasing histogram with the following limits:
The container SI was obtained from Table 4. The exact SI value was obtained by interpolating the two reliability values of each container. The results are depicted in Figure 19, and the SI values are between 0.65 and 0.85. This set of values defines the a priori risky and non-risky containers.
In our case study, the RFID variable represents the reading obtained from the electronic seals on the containers. An ancillary variable is generated using the “random” function and a normal distribution (Equation 25). The generated numbers are then transformed to obtain an increasing histogram of positive values whose limits are:
Then, the ancillary variable, , is compared to . If then (not forced open container). If then (forced open container). Higher values of imply a lower probability that it is a container that has been forced open.
is a binary variable that indicates whether a container contains illegal goods. To construct the set, we define another ancillary variable, , that is generated using the “random” function and a PDF distribution (Equation 25). The generated numbers are then transformed to obtain an increasing histogram of positive values whose limits are:
Then, the ancillary variable, , is compared to . If then (container contains illegal goods). If then (container does not contain illegal goods). High values of imply a high probability that the container contains illegal goods.
The following variables are defined: gives the container weight at the origin, gives the weight of the illegal goods, and gives the weight of stolen freight. These variables were generated using the MATLAB “rand” function for uniformly distributed random numbers. In our case study, we assume a maximum container load of , then:
Then, is uniformly distributed between to define weight of stolen goods in the containers of our case study.
The relationship between the and values helps simulate the behavior of the weight readings of the weight sensors, , as follows:
Note that in our research, the smuggling and stolen freight events do not occur at the same time. That is, a container could not have been stolen from and contain illegal goods at the same time.
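The weight simulation can be sketched as below. The relation used (destination weight equals origin weight plus smuggled weight minus stolen weight) and the sampling bounds are assumptions chosen to be consistent with the description; the maximum load is a parameter because its value is not restated here, and the function name is illustrative.

```python
import random


def simulate_weights(rng, max_load, event):
    """Return (w_origin, w_destination) for one simulated container.

    event is "none", "smuggling" or "theft"; the two illegal events are
    mutually exclusive, as stated in the annex.
    """
    if event not in ("none", "smuggling", "theft"):
        raise ValueError(f"unknown event: {event}")
    w_origin = rng.uniform(0.0, max_load)
    # Assumed bounds: smuggled goods fit in the remaining capacity,
    # and stolen goods cannot exceed the original load.
    w_illegal = rng.uniform(0.0, max_load - w_origin) if event == "smuggling" else 0.0
    w_stolen = rng.uniform(0.0, w_origin) if event == "theft" else 0.0
    return w_origin, w_origin + w_illegal - w_stolen


rng = random.Random(1)
for event in ("none", "smuggling", "theft"):
    origin, destination = simulate_weights(rng, 30.0, event)  # 30.0 t is a hypothetical limit
    print(event, round(destination - origin, 3))
```

A container with no event keeps its weight, a smuggling event can only increase it, and a theft event can only decrease it.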
The illegal and legal goods were obtained by generating two ancillary variables using the MATLAB “rand” function with limits between 0 and 1 to obtain uniformly distributed numbers.
is the set of legal goods, which are liquors, fuels, tobacco, medications, weapons, raw materials, textiles, food, manufactured goods and vehicles. To define set G, the ancillary variable range is divided into 10 equal parts.
is the set of illegal goods. The range of the ancillary variable is divided into 6 equal parts. The value indices vary from 1 to 5 for liquors, fuels, tobacco, medications, weapons and 11 for illegal drugs.
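Dividing the range of a uniform ancillary variable into equal parts amounts to a simple binning step, sketched below. The category lists follow the text; the function name and the clamping of the upper edge are illustrative choices.

```python
LEGAL_GOODS = ["liquors", "fuels", "tobacco", "medications", "weapons",
               "raw materials", "textiles", "food", "manufactured goods", "vehicles"]

# (index, name) pairs: indices 1 to 5 are shared with the legal goods, 11 marks illegal drugs.
ILLEGAL_GOODS = [(1, "liquors"), (2, "fuels"), (3, "tobacco"),
                 (4, "medications"), (5, "weapons"), (11, "illegal drugs")]


def pick_category(u, categories):
    """Map u in [0, 1] to one of len(categories) equal-width bins."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    index = min(int(u * len(categories)), len(categories) - 1)  # clamp u == 1.0 to the last bin
    return categories[index]


print(pick_category(0.95, LEGAL_GOODS))    # prints vehicles
print(pick_category(0.99, ILLEGAL_GOODS))  # prints (11, 'illegal drugs')
```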
Figure 20 shows all the different types of (legal and illegal) goods. Each index in the graphic represents the type of goods. The legal goods are represented by blue: 1 (liquors), 2 (fuels), 3 (tobacco), 4 (medications), 5 (weapons), 6 (raw material), 7 (textiles), 8 (foods) and 9 (manufactured products). The illegal goods are represented by yellow: 1 (liquors), 2 (fuels), 3 (tobacco), 4 (medications), 5 (weapons) and 11 (illegal drugs). Additionally, Table 10 defines the types of goods considered for our case study.
| Goods Transported | Classification of Goods | Classification of Goods According to Type |
|---|---|---|
| 1 liquors | Legal or illegal | Vodka, whiskey, beer, rum, etc. |
| 2 fuels | Legal or illegal | Oil, gasoline, diesel, kerosene, etc. |
| 3 tobacco | Legal or illegal | Cigarettes, cigars, etc. |
| 4 medications | Legal or illegal | Prescription medicines, legal drugs, natural medicines, etc. |
| 5 weapons | Legal or illegal | Firearms, ammunition, bladed weapons, etc. |
| 6 raw material | Legal | Vegetable, animal, mineral, liquid or fossil. |
| 7 textiles | Legal | Different types of cloth, clothes, etc. |
| 8 foods | Legal | Vegetables and animals. |
| 9 manufactured products | Legal | Consumer goods, capital goods and materials and supplies. |
| 11 illegal drugs | Illegal | Cocaine, ecstasy, amphetamines, etc. |
The types of goods used in the case study.
The data generation algorithm assumes that each container only transports one type of goods, and in the case that the container contains illegal goods, it is only one type of illegal goods.
In our case study, we selected X-ray technology as the nonintrusive inspection method among the current existing technological alternatives. X-ray imaging is one of the main nonintrusive technologies for container inspection, and it provides convincing details of the content of large objects such as containers. To determine the behaviors of both the X-ray scanner results and the operator, the proposed simulation emulates the behavior of an operator at the moment that an X-ray scan is performed; that is, the operator will see and analyze the data on the container contents, for example, the volume, shape, weight and type of material that it transports. This simulation uses the X-ray variable, which depends on several factors:
The volume factor is given by the following equation:
The shape factor is determined by comparing the shape of the transported goods and the shape of illegal merchandise , where it will equal 1 if the shape of is similar or equal to that of and 0 otherwise.
The weight factor is given by
The material factor is expressed by the following equation:
Two X-ray energy levels were applied (6 MeV and 10 MeV). Using this property, we can classify the contents of a container based on the image provided by the ratio of the different levels of attenuation.
The authors wish to acknowledge the financial support of project “Estrategias de diseño microelectronico para IOT en escenarios hostiles” (Ref. TEC2016-80396-C2-2-R) funded by the Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad (Convocatoria, 2016) for the completion of this work.
Cite this article
Leonela Morales, Luis Onieva, Ventura Pérez, Pablo Cortés. Using Fuzzy Logic Algorithms and Growing Hierarchical Self-Organizing Maps to Define Efficient Security Inspection Strategies in a Container Terminal. International Journal of Computational Intelligence Systems, May 2020. ISSN 1875-6883. https://doi.org/10.2991/ijcis.d.200430.001