Journal of Robotics, Networking and Artificial Life

Volume 8, Issue 3, December 2021, Pages 211 - 217

Selection of Optimal Error Recovery Process using Evaluation Standards in Automated Plants

Akira Nakamura1, *, Natsuki Yamanobe2, Ixchel G. Ramirez-Alpizar2, Kensuke Harada2, 3, Yukiyasu Domae2
1Department of Information Systems, Faculty of Engineering, Saitama Institute of Technology, 1690 Fusaiji, Fukaya, Saitama 369-0293, Japan
2Industrial Cyber Physical System Research Center, National Institute of Advanced Industrial Science and Technology (AIST) Second Annex, AIST Tokyo Waterfront, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064 Japan
3Robotic Manipulation Research Group Systems Innovation Department, Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka 560-8531, Japan
*Corresponding author. Email:
Received 31 October 2020, Accepted 31 July 2021, Available Online 9 October 2021.
Keywords: Error recovery; task stratification; error classification; automation plant

Plant automation has become increasingly popular in various industries. However, errors are more likely to occur in the difficult tasks that are often performed in automated plants. In the event of a large-scale error, such a task is often returned to a previous step and re-executed. Therefore, it is important to decide both the past step to which the task should return and the recovery process that follows the return. In this study, various evaluation standards are used to plan error recovery while considering these two factors.

© 2021 The Authors. Published by Atlantis Press International B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license.


Automated plants are used for production in various industries, thereby increasing production efficiency. However, automation of the system also leads to the occurrence of errors. Consequently, techniques for error recovery have become important, and research on error recovery has been actively conducted [1–5]. However, the recovery techniques considered in these studies are ad hoc and difficult to apply to real-world plants. Therefore, it is necessary to develop versatile and systematic error recovery technologies.

We have conducted research on the systematization of error recovery theory for several years. We proposed a new error recovery method based on the concepts of both task stratification and error classification [6–8]. As shown in Figure 1, the main segment of this method comprises a series of fundamental elements: sensing, modeling, planning, and execution. If an error occurs in the main segment, the process moves to the recovery portion. Subsequently, the cause of the error is estimated, the error is classified based on the result, the system is modified to be less prone to errors, and the process is rerun using the modified system, which achieves tasks with improved reliability.

Figure 1

Automated plant system with an error recovery function.

We have focused on determining the past step to which the process should revert, as well as the recovery process following the return. We previously proposed a planning method for error recovery that decides these two factors by considering the cost incurred during task execution [8]. In this study, we propose a planning method for error recovery that considers various evaluation standards rather than cost alone.

Section 2 explains the concept of skills that are motion primitives. Section 3 describes the error recovery techniques that we have proposed thus far. Section 4 discusses the proposed method for recovery planning using multiple evaluation standards, and Section 5 provides a simple sample that is used to examine the influence of evaluation standards on selecting a recovery process. Finally, Section 6 presents the conclusion.


2.1. Skill Primitives

By analyzing a person performing a task, we derived a sequence of primitives for various behaviors. These behavioral primitives are called “skills” in our research [9–11]. As shown in Figure 2, three skills play an important role in an assembly sequence: move-to-touch, rotate-to-level, and rotate-to-insert. A skill sequence for a task is obtained by assembling these three skills and their derivatives into a working sequence. The primitives of the work sequences of machines in a plant can be considered in the same manner as the skill sequences obtained from the analysis of human actions.

Figure 2

Three fundamental skills. (a) Move-to-touch skill. (b) Rotate-to-level skill. (c) Rotate-to-insert skill.

2.2. Stratification of Tasks

For tasks in an automated plant, a stratified representation makes it easier to consider work sequences, as shown in Figure 3 [6–8]. Please refer to Nakamura et al. [6] for more information on this technique.

Figure 3

Hierarchy of tasks.


Errors that occur during the execution of tasks in automated plants cannot be ignored. This section describes a method that derives the recovery process using error classification [6–8].

3.1. Error Classification

Errors are classified into the following four types according to their cause: execution, planning, modeling, and sensing. As shown in Figure 4, the cause of the error is generally estimated at the time of executing a task [6–8].

3.2. Error Recovery based on Classification

Immediately after an error occurs, the first step is to estimate the cause of the error, and the second step is to proceed to the error recovery course based on the estimated cause [6–8]. The corresponding system parameters are modified in the indicated error recovery course, as shown in Figure 4. This improves the system, making it less prone to the same error.
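
The two steps described above (estimate the cause, then enter the matching recovery course) can be sketched in Python as follows. This is only an illustrative sketch: the names `ErrorCause`, `PIPELINE`, and `stages_to_rerun` are hypothetical, and the rule that the process re-enters the pipeline at the stage named by the cause and reruns every later stage is an assumption made here, not a rule stated in the text.

```python
from enum import Enum
from typing import List


class ErrorCause(Enum):
    """The four error classes named in Section 3.1."""
    EXECUTION = "execution"
    PLANNING = "planning"
    MODELING = "modeling"
    SENSING = "sensing"


# Order of the fundamental elements in the main segment.
PIPELINE = ["sensing", "modeling", "planning", "execution"]


def stages_to_rerun(cause: ErrorCause) -> List[str]:
    """Return the stages to rerun once the cause has been estimated.

    Assumption (not from the paper): recovery re-enters the pipeline
    at the stage matching the cause and then repeats all later stages.
    """
    start = PIPELINE.index(cause.value)
    return PIPELINE[start:]


if __name__ == "__main__":
    for cause in ErrorCause:
        print(cause.name, "->", stages_to_rerun(cause))
```

Under this assumption, an error attributed to sensing forces the longest rerun, whereas an error attributed to execution only repeats the execution stage with modified parameters.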

Figure 4

Fundamental process flow with error recovery.

When the error is small, the process returns to an earlier step in the executed sequence, and the task sequence advances again, as shown in Figures 4 and 5. Conversely, when the error is large, the process returns to the steps in the upper hierarchy composing the task, and the recovery process then proceeds, as shown in Figure 5.

Figure 5

The expression of task stratification and the process flow of the error recovery.

3.3. Candidate Processes for Recovery

This subsection briefly describes the recovery process after an error occurs during task execution. We assume that task T, with start S and goal G, comprises n subtasks {subtask1, subtask2, …, subtaskn}, and that an error occurs in subtaskm (m is one of 1, 2, …, n), as shown in Figure 6. Here, subtaskm is the minimum traceable unit, as described in Nakamura et al. [6]. This is the smallest unit of the skill primitive sequence, and it starts with the first node to which the process must go back when an error occurs. That is, a minimum traceable unit represents a boundary: recovery can be made by going back to a certain node or to a step before that node, but recovery is not possible beyond that step.

Figure 6

Various processes of error recovery considered for a failure that occurred in subtaskm.

Here, it is assumed that an error occurs at subtaskm, and the process returns to subtaskj (j is one of 1, 2, …, m). Then, the recovery process begins from subtaskj, and returns to the original work sequence. However, it is not always possible to make the process identical to the original work sequence. Owing to various conditions, it may be necessary to use a sequence composed of different skill primitives, after an error occurs. This is why the substitute task for the subtask in the recovery process in Figure 6 is shown as “An equivalent task.”
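
The set of candidates described above can be enumerated with a short Python sketch. The names `RecoveryCandidate` and `enumerate_candidates` are hypothetical, and the sketch assumes exactly two variants per return point (the original sequence or one equivalent task); in a real plant a return point may offer several equivalent tasks or none.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecoveryCandidate:
    return_to: int        # index j of the subtask the process returns to
    use_equivalent: bool  # True if an equivalent task replaces the original


def enumerate_candidates(m: int) -> List[RecoveryCandidate]:
    """List candidates for an error in subtask m (1-based).

    The process may return to any subtask j with 1 <= j <= m.
    Candidates are listed from the shortest return (j = m) to the
    longest (j = 1). Two variants per j is an assumption of this sketch.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    return [
        RecoveryCandidate(return_to=j, use_equivalent=equivalent)
        for j in range(m, 0, -1)
        for equivalent in (False, True)
    ]


if __name__ == "__main__":
    for candidate in enumerate_candidates(3):
        print(candidate)
```

Even this simplified enumeration grows linearly with m, which illustrates why a selection criterion is needed once several return points are feasible.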


In the previous section, we showed that more than one recovery process may be available after an error occurs. In other words, it is often difficult to determine how far back the process should go and to select the appropriate skill sequence for the process following the return. Therefore, a method is required for selecting a suitable recovery process from the available options.

In our previous research, we used only the concept of cost to compose the process when considering error recovery [8]. In this paper, we instead consider various evaluation standards and select the optimum error recovery process under each of them. The following 10 evaluation standards are considered herein.

  1. (i)

    Cost

    We consider the cost of executing the recovery process as an evaluation standard. Nakamura et al. [8] considered only cost as a standard, which corresponds to item (i). Here, cost refers to material charges, parts charges, electricity bills, and other relevant process costs. Furthermore, expenses such as planning and personnel expenses may be included. Ultimately, the recovery process with the lowest total cost is prioritized.

  2. (ii)

    Time

    We consider the time required for the recovery process as an evaluation standard. Even for a single failure, how far back the process returns and where it rejoins the original sequence depend on the recovery process. Therefore, the comparison covers the time from the occurrence of the failure to the end of the original task.

  3. (iii)

    Reliability

    We consider the reliability associated with the execution of the recovery process as an evaluation standard. Consequently, the recovery process that best ensures that the goals of the original task are fulfilled takes precedence.

  4. (iv)

    Safety

    We consider the safety of the recovery process execution as an evaluation standard. Hence, the recovery process that provides the highest level of safety to the surrounding environment during execution is prioritized.

    Although reliability and safety may be perceived as similar evaluation standards, there is a difference between the two. Reliability is an indicator of the degree of achievement of both the recovery process and the original goal of the task. Conversely, safety is an indicator of the risk of harm to people or damage to materials in the period between the occurrence of the failure and the completion of the original task.

  5. (v)

    Finishing

    Here, finishing refers to the appearance, state, and condition of the object at the time the goal is reached. We consider the finishing after the execution of the complete task, including the recovery process, as an evaluation standard. For example, in manufacturing and repair tasks for equipment and household goods, the recovery process that gives the finished product a desirable appearance is judged to be better.

  6. (vi)

    Recovery data

    We consider recovery data as an evaluation standard. Specifically, the amount of data related to the error recovery process is used as an indicator, and the process with abundant data is preferred. This indicator is expected to become commonplace once data-driven AI techniques start being used for error recovery.

  7. (vii)

    Recovery parts

    We consider the parts used for error recovery as an evaluation standard. Depending on the type of failure, it is sometimes more feasible to replace parts. In such cases, when alternative parts are readily available, the recovery process that implements part replacement takes precedence.

  8. (viii)

    Recovery tools

    We consider the tools used in error recovery as an evaluation standard. If there is a dedicated tool for returning from an error state to a normal state, it is deemed more suitable to use it. In such a case, the process of recovering with the tool takes precedence.

  9. (ix)

    Recovery workspace

    We consider the workspace used for error recovery as an evaluation standard. In many cases, the feasibility of the recovery process is determined by whether there is space to perform the recovery operation. When a workspace for performing error recovery is secured, the process that utilizes that workspace is prioritized, depending on the type of failure.

  10. (x)

    Operator skill

    We consider the skills of the operator who performs error recovery as an evaluation standard. If there is an operator who is proficient in executing a certain recovery operation, the recovery process that involves that operator is prioritized.
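
Selection under a single evaluation standard can be sketched in Python as follows. The member names of `Standard`, the function `select_process`, and the numeric scores in the usage example are all hypothetical. The text states that the lowest total cost is prioritized; treating time the same way, and treating every other standard as "higher score is better" (for example, 1.0 when a dedicated part, tool, workspace, or skilled operator is available and 0.0 otherwise), are assumptions of this sketch.

```python
from enum import Enum
from typing import Dict


class Standard(Enum):
    """Shorthand labels for the ten evaluation standards (i) to (x)."""
    COST = "i"
    TIME = "ii"
    RELIABILITY = "iii"
    SAFETY = "iv"
    FINISHING = "v"
    RECOVERY_DATA = "vi"
    RECOVERY_PARTS = "vii"
    RECOVERY_TOOLS = "viii"
    RECOVERY_WORKSPACE = "ix"
    OPERATOR_SKILL = "x"


# Lowest cost is prioritized according to the text; lowest time is assumed.
LOWER_IS_BETTER = {Standard.COST, Standard.TIME}


def select_process(scores: Dict[str, float], standard: Standard) -> str:
    """Pick the recovery process that is best under one standard.

    `scores` maps a process label to its (hypothetical) score under
    the chosen standard.
    """
    if not scores:
        raise ValueError("no candidate recovery processes given")
    pick = min if standard in LOWER_IS_BETTER else max
    return pick(scores, key=scores.__getitem__)


if __name__ == "__main__":
    hypothetical_costs = {"process A": 120.0, "process B": 80.0}
    print(select_process(hypothetical_costs, Standard.COST))
```

Keeping the direction of each standard in one place makes it explicit that the same candidate list can yield a different choice depending on which standard the operator selects.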


In this section, we use a simple example to observe how the prioritized error recovery process changes with the chosen evaluation standard. The error recovery process is derived and compared based on a single evaluation standard at a time, selected from the 10 evaluation standards described in Section 4.

5.1. Several Types of Recovery Processes

Let us consider attaching a hook to a vertical flat plate with four precision screws. It is assumed that the vertical flat plate has four tapped holes for these precision screws. The task comprises various skill primitives, as shown in Figure 7, which can be divided into four sections. As shown in Figure 7a–7d, the first section is a tacking task that temporarily fastens four screws to a horizontally laid hook. As shown in Figure 7e and 7f, the second section is an erecting task, wherein a horizontally inverted hook is made to stand vertically. As shown in Figure 7g and 7h, the third section is a touching task for moving the hook to the mounting position on the vertical flat plate. As shown in Figure 7i–7m, the fourth section is an installation task for fixing the hook to the vertical flat plate by tightening the four screws.

Figure 7

An assembly task in which a hook is attached to a plate with four precision screws.

Now, let us consider the recovery processes for a failure in the step shown in Figure 7h, where one screw that was temporarily fixed comes off and falls (Figure 8). Here, we discuss three types of recovery processes.

  • Recovery Type I (RT-I)

    The first recovery type (RT-I) is the process of returning to starting step S of the original task and starting over (Figure 9). In this case, the hook with three screws and the one screw that fell off when the error occurred are abandoned, and the task is rerun with new parts, that is, a new hook and four new screws.

  • Recovery Type II (RT-II)

    The second recovery type (RT-II) is the process of returning to the step shown in Figure 7d in the tacking task and starting over from this step, as shown in Figure 10. After the return to the tacking task, a screw is temporarily fastened in the empty hole once again; two variants arise depending on the screw used at this point. One abandons the fallen screw and uses a new one from the parts box; the other finds the fallen screw and uses it again. The former is called process [RT-II(N)], and the latter is called process [RT-II(F)].

  • Recovery Type III (RT-III)

    The third recovery type (RT-III) is the process of returning to the skill primitive in subtaskm where the failure occurred and starting over from there, as shown in Figure 11. This method has a shorter return process; however, unlike (RT-II), in which the screw is fastened in the vertical direction, it requires the insertion and temporary fastening of the screw in the horizontal direction. Mistakes are likely to occur in this temporary fixing process, so the difficulty level is high. As in the case of (RT-II), there are two variants depending on the screw used in the relevant step. The method of discarding the fallen screw and using a new screw from the parts box is called process [RT-III(N)]; the method of finding the fallen screw and reusing it is called process [RT-III(F)]. Immediately after an error occurs, the first step is to estimate the cause of the error, and the second is to proceed to the error recovery course based on the estimated cause.

Figure 8

An error in which a screw is dropped at (h) in Figure 7.

Figure 9

Recovery Type I (RT-I).

Figure 10

Recovery Type II (RT-II).

Figure 11

Recovery Type III (RT-III).

5.2. Suitable Process in Each Evaluation Standard

In the previous subsection, we considered five types of error recovery processes—(RT-I), [RT-II(N)], [RT-II(F)], [RT-III(N)], and [RT-III(F)]—for screw fallout errors in the task of fixing a hook to a vertical flat plate with four precision screws. Here, the priorities of these five recovery processes have been derived for each evaluation standard that is described in Section 4. For simplicity, priorities are explained qualitatively, rather than quantitatively.

  1. (i)

    Cost

    Priority order = {[RT-II(N)], [RT-III(N)], [RT-II(F)], [RT-III(F)], and (RT-I)}

    The priorities of the five error recovery processes under the cost-based evaluation standard have been described in detail in our previous research [8]. Here, material charges, parts charges, electricity bills, and planning expenses are considered as costs. The (RT-I) process is costly, as it discards the entire unit in which the error occurs and starts over from the beginning. In contrast, the [RT-II(N)] process is not costly, as it involves neither a search operation nor a difficult operation.

  2. (ii)

    Time

    Priority order = {[RT-III(N)], [RT-II(N)], (RT-I), [RT-III(F)], and [RT-II(F)]}

    The [RT-II(F)] process requires a search operation and must return to much earlier steps, which is time consuming and therefore unsuitable. In contrast, the [RT-III(N)] process does not involve a search operation and returns only a few steps; hence, it is the optimum choice.

  3. (iii)

    Reliability

    Priority order = {(RT-I), [RT-II(N)], [RT-III(N)], [RT-II(F)], and [RT-III(F)]}

    The (RT-I) process is the most reliable process, as it runs the original sequence from the beginning.

    The [RT-II(N)] and [RT-III(N)] processes are the next most reliable after the (RT-I) process, because they have no search operations and perform operations that are almost the same as those originally planned. Conversely, the [RT-II(F)] and [RT-III(F)] processes are not suitable in terms of reliability, because they require search operations. Moreover, [RT-II(N)] and [RT-II(F)] are more suitable than [RT-III(N)] and [RT-III(F)], respectively, because they involve no difficult operations.

  4. (iv)

    Safety

    Priority order = {(RT-I), [RT-II(N)], [RT-II(F)], [RT-III(N)], and [RT-III(F)]}

    The (RT-I) process is the most desirable in terms of safety, because it executes the original sequence from the beginning. The [RT-II(N)] and [RT-II(F)] processes are the next safest options, because they involve neither difficult operations nor unstable postures. In contrast, the [RT-III(N)] and [RT-III(F)] processes are not suitable, because they require difficult operations. Additionally, [RT-II(N)] and [RT-III(N)] are more suitable than [RT-II(F)] and [RT-III(F)], respectively, because they do not involve a search process.

  5. (v)

    Finishing

    Priority order = {Fundamentally the same for all five}

    There is no significant difference in the finishing of the sample task among the processes. However, operations that may adversely affect the target item or its surroundings, such as scratching or scattering, should be avoided.

  6. (vi)

    Recovery data

    Priority order = {Process with a significant amount of data is prioritized}

    The amount of data is important when the error recovery process is derived using data science techniques. Even when an operation is considerably difficult to perform, a large amount of data can allow the task to be accomplished successfully.

  7. (vii)

    Recovery parts

    Priority order = {Process of using dedicated parts is given priority}

    If dedicated parts are prepared for replacement, it is desirable to prioritize the process that uses those parts. For example, suppose there is a special replacement unit that corresponds to a hook temporarily fastened with four screws, as shown in Figure 7a–7d. When an error occurs, the use of the replacement unit simplifies the recovery task.

  8. (viii)

    Recovery tools

    Priority order = {Preference is given to a process with a dedicated tool}

    Here, we consider a case in which there is a dedicated tool for recovery. If a process that uses the dedicated recovery tool is incorporated, it is highly likely that the recovery will be quick and reliable. If there is a special tool that can temporarily fasten the hook with four precision screws in any posture, the [RT-III(N)] or [RT-III(F)] process is given the highest priority.

  9. (ix)

    Recovery workspace

    Priority order = {Process of using the workspace takes precedence}

    If there is a dedicated space prepared for replacement work, it is desirable to prioritize the process that utilizes this area. In this assembly task, if there is a space for replacement work near the place where the failure occurs, the process that uses that space is given priority. For example, in the RT-II process, it is efficient to use a dedicated workspace for the process of temporarily fixing the screws, as shown in Figure 10d, e, r.

  10. (x)

    Operator skill

    Priority order = {Efficient recovery process of the operator is prioritized}

    We consider a recovery process that uses a teaching operator. It is preferable to prioritize a process whose recovery operations the operator is good at. However, operators may differ in their level of efficiency. Hence, the processes suitable for skilled operators may differ from those suitable for unskilled operators.
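
The four explicit priority orders given above for cost, time, reliability, and safety can be collected in a small lookup table, as in the following Python sketch. The orders are copied from items (i) to (iv); the names `PRIORITY_ORDERS`, `best_process`, and `rank_of` are hypothetical. The remaining standards are omitted because the text states their priorities only conditionally (for example, on whether a dedicated tool exists).

```python
from typing import Dict, List

# Priority orders from items (i) to (iv), highest priority first.
PRIORITY_ORDERS: Dict[str, List[str]] = {
    "cost": ["RT-II(N)", "RT-III(N)", "RT-II(F)", "RT-III(F)", "RT-I"],
    "time": ["RT-III(N)", "RT-II(N)", "RT-I", "RT-III(F)", "RT-II(F)"],
    "reliability": ["RT-I", "RT-II(N)", "RT-III(N)", "RT-II(F)", "RT-III(F)"],
    "safety": ["RT-I", "RT-II(N)", "RT-II(F)", "RT-III(N)", "RT-III(F)"],
}


def best_process(standard: str) -> str:
    """Return the highest-priority recovery process under one standard."""
    return PRIORITY_ORDERS[standard][0]


def rank_of(process: str, standard: str) -> int:
    """Return the 1-based rank of a process under one standard."""
    return PRIORITY_ORDERS[standard].index(process) + 1


if __name__ == "__main__":
    for standard in PRIORITY_ORDERS:
        print(standard, "->", best_process(standard))
```

The table makes the trade-off visible: (RT-I) ranks first for reliability and safety but last for cost, while [RT-II(N)] is first or second under all four standards.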

We have examined the appropriate process by selecting only one out of the 10 evaluation standards at a time. In practice, however, it may be desirable to derive the optimal process using a combined evaluation of multiple standards.
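
One possible way to combine several standards is a weighted rank aggregation. The Python sketch below uses a weighted Borda count, which is a technique swapped in here purely for illustration; it is not a method proposed in this paper. The function name `combine_orders` and the weights are hypothetical, while the two priority orders are the cost and safety orders stated in Section 5.2.

```python
from typing import Dict, List


def combine_orders(orders: Dict[str, List[str]],
                   weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted Borda count over several priority orders.

    In an order of n processes, first place earns n - 1 points and
    last place earns 0; points are scaled by the standard's weight.
    """
    totals: Dict[str, float] = {}
    for standard, order in orders.items():
        n = len(order)
        for position, process in enumerate(order):
            points = (n - 1 - position) * weights[standard]
            totals[process] = totals.get(process, 0.0) + points
    return totals


# Cost and safety orders as stated in Section 5.2, highest priority first.
ORDERS = {
    "cost": ["RT-II(N)", "RT-III(N)", "RT-II(F)", "RT-III(F)", "RT-I"],
    "safety": ["RT-I", "RT-II(N)", "RT-II(F)", "RT-III(N)", "RT-III(F)"],
}

# Hypothetical weights chosen only to show their influence.
equal = combine_orders(ORDERS, {"cost": 1.0, "safety": 1.0})
safety_first = combine_orders(ORDERS, {"cost": 1.0, "safety": 5.0})

best_equal = max(equal, key=equal.__getitem__)
best_safety_first = max(safety_first, key=safety_first.__getitem__)

if __name__ == "__main__":
    print("equal weights:", best_equal)
    print("safety weighted 5x:", best_safety_first)
```

With equal weights, [RT-II(N)] obtains the highest total, whereas weighting safety five times as heavily as cost moves (RT-I) to the top. The choice of weights therefore matters as much as the choice of standards and would have to be justified for each plant.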


If an error occurs during the execution of a task in an automated plant, the process that executes the main portion moves to the recovery portion. In general, various candidates can be obtained for this recovery process. In this study, we proposed a method for selecting the optimal candidate based on a single type of evaluation standard. Furthermore, 10 different types of evaluation standards were specifically considered. However, the choice of indicators was left to the operator. As a specific example, we considered the recovery process for a failure in which a temporarily fastened screw falls out during the hook installation process.

In this study, for simplicity, the recovery process was selected individually for each type of evaluation standard. However, it is often desirable to make selections using multiple types of evaluation standards simultaneously. A future study will determine the error recovery process according to multiple standards.


The authors declare they have no conflicts of interest.


Prof. Akira Nakamura

He received his PhD degree in Electrical Engineering from Keio University in 1991. Since 2021, he has been working as a Professor at the Faculty of Engineering of Saitama Institute of Technology. His research interests include robot planning, vision, and control systems.

Dr. Natsuki Yamanobe

She received the PhD degree from the University of Tokyo in 2007. She is currently a Senior Research Scientist with Artificial Intelligence Research Center of Advanced Industrial Science and Technology (AIST).

Dr. Ixchel G. Ramirez-Alpizar

She received her PhD degree in Mechanical Engineering from the Graduate School of Engineering, Osaka University, Japan, in 2013. Since 2019, she has been working as a senior researcher at the Artificial Intelligence Research Center of AIST. Her research interests include robot learning from demonstration, robotic manipulation and human–robot collaboration. She is a member of IEEE and RAS.

Prof. Kensuke Harada

He received his doctoral degree in Mechanical Engineering from Kyoto University in 1997. Since 2016, he has been working as a Professor at the Graduate School of Engineering Science, Osaka University.

Dr. Kazuyuki Nagata

He received his PhD degree from Hokkaido University in 2014. He is currently a team leader with the Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST).


