Journal of Robotics, Networking and Artificial Life

Volume 8, Issue 4, March 2022, Pages 278 - 283

Fuzzy Theory-based Air Valve Control for Auto-Score-Recognition Soprano Recorder Machines

Authors
Chun-Chieh Wang*
Department of Automation Engineering and Institute of Mechatronoptic Systems, Chienkuo Technology University, Taiwan
Corresponding Author
Chun-Chieh Wang
Received 6 November 2020, Accepted 22 September 2021, Available Online 27 December 2021.
DOI
10.2991/jrnal.k.211108.010
Keywords
Auto-score-recognition soprano recorder machines (ASRSRM); LabVIEW; Arduino; fuzzy theory-based air valve control; pneumatic cylinder
Abstract

Previous work on score recognition and automated recorder performance has several shortcomings, which this article addresses. First, for music score recognition, a y-axis projection method is used to detect the staff position and remove the staff, replacing the erosion and dilation operations of morphology. The result can then be used to distinguish the notes, which are written in specific positions on the staff. For soprano recorder playing, we previously used finger-shaped electric arms to press the finger holes, which could not keep up with the tempo of the score. To remove this drawback, the motors are replaced with solenoid valves so that pneumatic cylinders press the finger holes smoothly. In addition, since different pitches on the soprano recorder require different air pressures, we increase the number of air valves from one to three, and the range is divided into bass, midrange, and treble. Furthermore, fuzzy theory-based air valve control is applied to the auto-score-recognition soprano recorder machine to greatly reduce the sound distortion caused by the original single air valve. Experiments show that fuzzy theory-based air valve control combined with sheet music recognition techniques can fully realize the functions of an autoplaying soprano recorder machine.

Copyright
© 2021 The Author. Published by Atlantis Press International B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

Nowadays, robots are no longer used only in industrial production; they can also be seen in the medical and artistic fields. The International Robotic Art Competition, which started in 2016, has been held three times, and many works created by robots have reached a level comparable to that of human artists. Performance robots have also advanced significantly thanks to artificial intelligence. In particular, music score recognition technology has been studied in many publications. Although music notation uses a limited set of symbols and restricts where they may be written on the staff, many blind spots remain to be overcome when recognizing it with artificial intelligence, even for non-handwritten scores.

Lee et al. [1] proposed a two-wheeled robot that can autonomously read music and sing songs with a vocal voice. Its disadvantage is that it recognizes only digital scores. Hong [2] proposed using mixed media to draw portraits in a hand-drawn style. The input image undergoes different image processing steps to extract features, which are combined to form a Non-Photorealistic Rendering (NPR)-style image. Because only five colors are used for mixing and modulating the colors, the error in color toning is about 10%. Chang et al. [3] used optical symbol recognition. Since this approach requires a symbol database, recognition errors can occur whenever a symbol deviates even slightly from the database. Xiao et al. [4] presented a real-time optical music recognition system for a dulcimer musical robot. Since the note groups are decomposed into only three fundamental elements, certain notes are easily recognized inaccurately. Wang and Chen [5] proposed a musical score recognition system for iOS devices; because its recognition process requires human assistance, it cannot achieve fully automatic recognition.

To address the shortcomings of the work above, this article proposes fuzzy theory-based air valve control combined with sheet music recognition techniques to improve the recognition of music scores and reduce the broken sound that occurs during blowing. Experiments show that the presented method can fully realize the functions of Auto-Score-Recognition Soprano Recorder Machines (ASRSRM).

2. ASRSRM

To solve the problem that the finger-shaped robotic arms cannot keep up with the rhythm of the music, all motors are replaced with solenoid valves, which are combined with an Arduino UNO board to control the air cylinders that press the finger holes. The finished ASRSRM is shown in Figure 1.

Figure 1

Finished product of the ASRSRM.

2.1. Control Systems

The Arduino UNO board is the control core of the ASRSRM. LINX, in the LabVIEW graphical programming environment, is the bridge between the LabVIEW program and the Arduino UNO board. Like the LabVIEW Interface for Arduino (LIFA), LINX requires that a set of communication commands first be written to the Arduino. The difference is that LIFA lets users operate the Arduino connection directly in LabVIEW without writing Arduino code. For other details, please refer to Wang and Jhang [6]. The control system architecture is shown in Figure 2, and Figure 3 shows the Arduino UNO control board.

Figure 2

The control system architecture diagram.

Figure 3

Arduino UNO.

2.2. ASRSRM Platform

The physical architecture of the ASRSRM platform is shown in Figure 4. Its main feature is that the finger holes of the recorder are pressed by pneumatic cylinders controlled by solenoid valves. This significantly improves on the original machine, which could not play music faster than the response speed of its motors. In addition, the blowing control part has two more solenoid valves, which solves the problem of broken sound when blowing. To make the blowing smoother and closer to the way a human plays the recorder, we have increased the original three solenoid valves to nine in the new ASRSRM platform, as shown in Figure 5.

Figure 4

The physical architecture of the ASRSRM platform.

Figure 5

Nine solenoid valves.

3. MUSIC SCORE RECOGNITION SYSTEM

First, we remove the parts of the score that are not related to performance, as shown in Figure 6. The processed music score is then input into the system, and binarization and scale recognition are performed in LabVIEW. Figure 7 shows the program flow chart.

Figure 6

Score preprocessing.

Figure 7

Program flow chart.

First, the note stems are distinguished and eliminated by the x-axis projection detection method. Second, the center point of the head of the note and the position of the staff are identified by the pixel clustering method. Finally, the encoding of the scale is arranged in sequence from bass to treble. For other details, please refer to Wang and Jhang [6].

3.1. Binarization

Binarization divides the grayscale image into black and white according to a threshold set by the user. When the grayscale value of a pixel is greater than the threshold, the pixel is judged to be white; otherwise it is black. In this way the grayscale image is converted into a binary image. For other details, please refer to Wang and Jhang [6].
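The thresholding rule above can be illustrated with a minimal Python sketch. The paper performs this step in LabVIEW; the function name `binarize`, the list-of-lists image layout, and the default threshold of 128 are assumptions made only for illustration.

```python
def binarize(gray, threshold=128):
    """Return a binary image: 255 (white) where the gray value is
    strictly greater than the threshold, otherwise 0 (black)."""
    return [[255 if px > threshold else 0 for px in row] for row in gray]


# Tiny 2 x 3 grayscale image (values 0..255), invented for the example.
gray = [
    [250, 200, 30],
    [128, 129, 0],
]
binary = binarize(gray, threshold=128)
print(binary)  # [[255, 255, 0], [0, 255, 0]]
```

Note that a pixel exactly equal to the threshold is classified as black, matching the "greater than" wording of the rule.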

3.2. Eliminate Staff

We use the y-projection of the orthogonal projection method to project the music score onto a single axis, forming a graph called the projection profile, as shown in Figure 8. The positions of the staff lines stand out clearly in this profile, so the staff can be located and removed. This method greatly improves on the previous approach, in which morphological erosion and dilation left the symbols unclear, as shown in Figure 9.
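A minimal Python sketch of staff detection by y-axis projection follows. It is not the LabVIEW implementation used in the paper. The image convention (1 = black ink, 0 = white), the `min_fill` ratio, and the rule that a staff pixel is kept when ink lies directly above or below it (so that symbols crossing a line are not cut) are all assumptions for illustration.

```python
def row_profile(img):
    """y-axis projection: number of black pixels (1s) in each row."""
    return [sum(row) for row in img]


def find_staff_rows(img, min_fill=0.8):
    """Rows whose black-pixel count covers at least min_fill of the width."""
    width = len(img[0])
    return [y for y, count in enumerate(row_profile(img)) if count >= min_fill * width]


def remove_staff(img, min_fill=0.8):
    """Erase staff rows, keeping pixels that belong to a crossing symbol.

    Returns the cleaned image and the recorded staff row positions.
    """
    staff = set(find_staff_rows(img, min_fill))
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in sorted(staff):
        # Nearest non-staff rows above and below (staff lines may be thick).
        above, below = y - 1, y + 1
        while above in staff:
            above -= 1
        while below in staff:
            below += 1
        for x in range(width):
            ink_above = above >= 0 and img[above][x] == 1
            ink_below = below < height and img[below][x] == 1
            if not (ink_above or ink_below):
                out[y][x] = 0
    return out, sorted(staff)


img = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],  # staff line crossed by a vertical stroke
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned, staff_rows = remove_staff(img)
print(staff_rows)   # [2]
print(cleaned[2])   # [0, 0, 1, 0, 0, 0]
```

Returning the staff rows alongside the cleaned image mirrors the text, which reuses the recorded staff position later when locating note heads.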

Figure 8

The process of removing the staff. (a) Original staff. (b) The result after y-axis projection. (c) The result after removing the staff.

Figure 9

The process of removing the staff (using the previous method). (a) Original image (using the previous method). (b) The result after removing the staff (using the previous method).

3.3. Symbol Distinction

For an introduction to rests, please refer to Wang and Jhang [6]. Generally, most rests are no taller than the notes. Therefore, in this paper the symbols are first divided into rests and general notes to facilitate subsequent identification. To distinguish continuous notes from discontinuous notes, the image is projected onto the x-axis using the x-projection of the orthogonal projection method, and note stems whose projection exceeds a certain value are removed, as shown in Figure 10. The pixel clustering method is then used to distinguish three types: discontinuous notes, continuous notes, and discontinuous notes with tails, as shown in Figure 11.
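The x-axis projection step can be sketched in Python as follows. The paper does not give the cut-off value, so `max_height` is a hypothetical parameter. Erasing the whole column is a simplification that also removes a one-pixel slice of the note head where the stem attaches. The image uses 1 for black and 0 for white.

```python
def column_profile(img):
    """x-axis projection: number of black pixels (1s) in each column."""
    return [sum(col) for col in zip(*img)]


def remove_stems(img, max_height):
    """Erase every column whose black-pixel count exceeds max_height."""
    stem_cols = [x for x, count in enumerate(column_profile(img)) if count > max_height]
    out = [row[:] for row in img]
    for row in out:
        for x in stem_cols:
            row[x] = 0
    return out, stem_cols


# A crude quarter note: a 2 x 2 head with a stem rising in column 3.
note = [
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
head_only, stems = remove_stems(note, max_height=3)
print(stems)         # [3]
print(head_only[4])  # [0, 1, 1, 0, 0]
```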

Figure 10

The process of removing note stems. (a) The result after removing the staff. (b) The result after x-axis projection. (c) The result after removing note stems.

Figure 11

Comparison chart after removing note stems. (a) Discontinuous notes. (b) Continuous notes. (c) Discontinuous notes but with tails.

The pixel clustering method uses the IMAQ Count Objects 2 VI in the Vision Development Module of LabVIEW. This VI clusters the binarized pixels and lets the user set a pixel threshold to distinguish the pixel groups. Figure 12 shows the result of pixel cluster identification.
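For readers without LabVIEW, the idea of pixel clustering can be approximated by connected-component labeling. The sketch below is a stand-in for IMAQ Count Objects 2, not a reimplementation of it. The 8-connectivity, the `min_pixels` size filter, and the returned fields are assumptions; the image uses 1 for black and 0 for white.

```python
from collections import deque


def find_clusters(img, min_pixels=1):
    """Group touching black pixels (8-connectivity) into clusters.

    Each cluster reports its size, bounding box (top, left, bottom, right),
    and center point (mean row, mean column).
    """
    height, width = len(img), len(img[0])
    seen = [[False] * width for _ in range(height)]
    clusters = []
    for y0 in range(height):
        for x0 in range(width):
            if img[y0][x0] != 1 or seen[y0][x0]:
                continue
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            pixels = []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) >= min_pixels:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                clusters.append({
                    "size": len(pixels),
                    "box": (min(ys), min(xs), max(ys), max(xs)),
                    "center": (sum(ys) / len(ys), sum(xs) / len(xs)),
                })
    return clusters


img = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
print(len(find_clusters(img)))            # 3
print(find_clusters(img, min_pixels=2))   # only the 2 x 2 blob remains
```

The center point returned here plays the role of the note-head center that is later compared with the staff position.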

Figure 12

The result of pixel cluster identification.

3.4. Scale Recognition

Note timing is recognized at the same time as the scale, so the head and tail of each note must be distinguished first. Discontinuous notes are already separated once the note stems are removed, so they need no further processing. For discontinuous notes with tails, the head and tail are distinguished by the proportion of black pixels in the image extracted by pixel clustering: the image size and the ratio of black pixels are calculated within the range enclosed by the red frame, as shown in Figure 13. Continuous notes cannot be handled this way because the black-pixel proportions of their note stems and note heads are too close. Instead, we use the fact that the aforementioned note stems account for more than two-thirds of the overall note width. Because the position of the beam changes with the way the music is written, we divide each continuous note at its center point into upper and lower parts, as shown in Figure 14.
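The black-pixel ratio test for separating a note head from a tail might look like the following Python sketch. The figure captions report roughly 80% for a head and 30 to 40% for a tail; the cut-off `head_min=0.6` placed between those values is an assumption, as are the function names. The image uses 1 for black and 0 for white, and the box is (top, left, bottom, right), inclusive.

```python
def black_ratio(img, box):
    """Fraction of black pixels inside an inclusive bounding box."""
    top, left, bottom, right = box
    area = (bottom - top + 1) * (right - left + 1)
    black = sum(img[y][x]
                for y in range(top, bottom + 1)
                for x in range(left, right + 1))
    return black / area


def classify_fragment(ratio, head_min=0.6):
    """Label a fragment as a note head (dense) or a tail (sparse)."""
    return "head" if ratio >= head_min else "tail"


head = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 1, 1],
]
tail = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
print(classify_fragment(black_ratio(head, (0, 0, 2, 2))))  # head
print(classify_fragment(black_ratio(tail, (0, 0, 2, 2))))  # tail
```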

Figure 13

Distinguishing discontinuous notes with tails. (a) Note head (80%). (b) Note tail (30–40%).

Figure 14

The process of cutting continuous notes. (a) Original image with note head at the bottom. (b) Original image with note head at the top.

Discontinuous notes differ in the beam and in whether the note head is hollow. We use the ratio of black pixels in the red box to distinguish half notes from quarter notes, as shown in Figure 15. Because the position of the staff was recorded when the staff was removed, we only need to extract the center point of the note head through pixel clustering and compare it with the recorded staff position to know which line or space the note head lies on. Looking this position up in the scale table of the recorder gives the scale of the note, as shown in Figures 16 and 17.
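Comparing the note-head center with the recorded staff position can be sketched in Python as below. The scale table of Figure 16 is not reproduced in the text, so this sketch assumes a standard treble clef whose bottom line is E4; the function names and the pixel coordinates are invented for illustration. Image y coordinates grow downward.

```python
NOTE_NAMES = ["C", "D", "E", "F", "G", "A", "B"]


def staff_step(head_y, staff_lines):
    """Number of line/space positions above the bottom staff line.

    0 = bottom line, 1 = first space, 2 = second line, and so on.
    Negative values lie below the staff.
    """
    lines = sorted(staff_lines)                    # top line first
    spacing = (lines[-1] - lines[0]) / (len(lines) - 1)
    return round((lines[-1] - head_y) / (spacing / 2))


def step_to_pitch(step, bottom_line=("E", 4)):
    """Convert a staff position to a note name (treble clef assumed)."""
    name, octave = bottom_line
    index = NOTE_NAMES.index(name) + step
    return NOTE_NAMES[index % 7] + str(octave + index // 7)


staff_lines = [10, 20, 30, 40, 50]   # y of the five lines, top to bottom
for head_y in (50, 45, 30, 60):
    print(head_y, step_to_pitch(staff_step(head_y, staff_lines)))
# 50 E4
# 45 F4
# 30 B4
# 60 C4
```

The resulting note name would then index a fingering table such as the one in Figure 17 to decide which holes to close.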

Figure 15

Solid and hollow note heads. (a) Solid note head (80%). (b) Hollow note head (50%).

Figure 16

Scale table of the soprano recorder.

Figure 17

Scale press fingering table of the soprano recorder.

3.5. Rest Identification

Rests are identified in two stages, using the orthogonal projection method and pixel clustering. According to their appearance, commonly used rests are divided into three types: (1) whole rest and half rest, (2) quarter rest, and (3) 8th rest and 16th rest. First, we use the pixel clustering method to capture the rest image and use the black pixel ratio in the red frame to make a preliminary judgment, as shown in Figure 18. Second, a detailed distinction is made within each of the three types. (1) The image coordinates of the y-axis projection of the whole rest and the half rest are captured and compared with the staff position. If the coordinates are closer to the fourth line of the staff, the symbol is judged to be a whole rest; if they are closer to the third line, it is a half rest. (2) The height obtained from the y-axis projection is compared with the highest value of the x-axis projection. If the two values are similar, the symbol is judged to be a quarter rest, as shown in Figure 19. (3) The 8th rest has only one peak in the y-axis projection, whereas the 16th rest has two peaks, as shown in Figures 20 and 21.
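Step (3) above, counting peaks in the y-axis projection, can be sketched in Python as follows. The paper does not say how a peak is defined, so this sketch assumes a peak is a run of rows whose black-pixel count reaches a fraction `rel_height` of the maximum; both that parameter and the sample profiles are invented for illustration.

```python
def count_peaks(profile, rel_height=0.6):
    """Count separate runs of values at or above rel_height * max(profile)."""
    if not profile or max(profile) == 0:
        return 0
    level = rel_height * max(profile)
    peaks, inside = 0, False
    for value in profile:
        if value >= level and not inside:
            peaks += 1          # entering a new run above the level
            inside = True
        elif value < level:
            inside = False      # left the run
    return peaks


def classify_flagged_rest(profile):
    """One peak suggests an 8th rest, two peaks a 16th rest."""
    return {1: "eighth rest", 2: "sixteenth rest"}.get(count_peaks(profile), "unknown")


print(classify_flagged_rest([1, 2, 6, 2, 1, 1, 1]))  # eighth rest
print(classify_flagged_rest([1, 5, 2, 1, 6, 2, 1]))  # sixteenth rest
```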

Figure 18

Distinguish between rests. (a) Whole rest and half rest (99%). (b) Quarter rest (50–70%).

Figure 19

Quarter rest. (a) Original image. (b) y-axis projection. (c) x-axis projection.

Figure 20

Eighth rest. (a) Original image. (b) y-axis projection. (c) x-axis projection.

Figure 21

Sixteenth rest. (a) Original image. (b) y-axis projection. (c) x-axis projection.

4. FUZZY THEORY-BASED AIR VALVE CONTROL DESIGN

To provide smoother air pressure across the different sound ranges, closer to the way a human plays the recorder, this article increases the number of solenoid valves to nine, each with a different opening and closing gap. A fuzzy theory-based control law is used to write the valve control program. The range (R) is divided into small bass (SB), medium bass (MB), high bass (HB), small midrange (SM), medium midrange (MM), high midrange (HM), small treble (ST), medium treble (MT), and high treble (HT), as shown in Figure 22. Let U be the sound range and V be the solenoid air valves. The membership function of the range is of Mamdani type, and the membership function of the output (solenoid air valves) is of Takagi-Sugeno type. The fuzzy IF-THEN rules are expressed as follows.

If U is SB, then V is V1.

If U is MB, then V is V2.

If U is HB, then V is V3.

If U is SM, then V is V4.

If U is MM, then V is V5.

If U is HM, then V is V6.

If U is ST, then V is V7.

If U is MT, then V is V8.

If U is HT, then V is V9.

The opening and closing gaps of the solenoid air valves V1–V9 are measured in number of turns. Their values are V1 = 1.1, V2 = 1.25, V3 = 1.5, V4 = 2, V5 = 2.25, V6 = 2.5, V7 = 2.75, V8 = 2.85, and V9 = 2.95.
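The nine rules and the V1 to V9 values can be exercised with a small Python sketch. The shapes of the membership functions in Figure 22 are not given in the text, so this sketch assumes nine evenly spaced triangular functions over a pitch position `u` normalized to [0, 1]. The paper also does not state how rule outputs become a valve command; two options are shown under that caveat: picking the rule with the strongest firing, and the usual zero-order Sugeno weighted average.

```python
LABELS = ["SB", "MB", "HB", "SM", "MM", "HM", "ST", "MT", "HT"]
TURNS = [1.1, 1.25, 1.5, 2.0, 2.25, 2.5, 2.75, 2.85, 2.95]  # V1..V9 from the text


def memberships(u):
    """Firing strength of each range label (assumed triangular, evenly spaced)."""
    u = min(1.0, max(0.0, u))
    step = 1 / (len(LABELS) - 1)
    return [max(0.0, 1.0 - abs(u - k * step) / step) for k in range(len(LABELS))]


def select_valve(u):
    """Winner-take-all: label and valve number (1..9) of the strongest rule."""
    mu = memberships(u)
    k = max(range(len(mu)), key=lambda i: mu[i])
    return LABELS[k], k + 1


def sugeno_output(u):
    """Zero-order Sugeno defuzzification: weighted average of the turns."""
    mu = memberships(u)
    return sum(m * v for m, v in zip(mu, TURNS)) / sum(mu)


print(select_valve(0.0))      # ('SB', 1)
print(select_valve(0.5))      # ('MM', 5)
print(sugeno_output(0.0625))  # halfway between V1 and V2
```

With on/off solenoid valves that each have a fixed opening, the winner-take-all selection is the more direct reading of the rules; the weighted average is included only to show standard Sugeno inference.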

Figure 22

The range.

5. EXPERIMENTAL RESULTS

Because musicians around the world notate tuplets (Figure 23a) and dotted notes (Figure 23b) in very different ways, no general rule can effectively identify all instances of these two symbols. Therefore, they are ignored in the experiments. The scores used for testing were taken from free websites maintained by online users [7,8]. They include five Mandarin pop songs, two Japanese pop songs, and three movie theme songs. The experimental results can be viewed at the following URL.

https://www.youtube.com/watch?v=1Z-kPIPQG-U

Figure 23

(a) Tuplets. (b) Dotted notes.

6. CONCLUSION

This paper uses the pixel clustering method and the x- and y-axis projection methods to improve the score recognition results. Moreover, the finger-shaped electric arms are replaced with solenoid valves so that pneumatic cylinders press the finger holes smoothly, which also reduces air leakage. Furthermore, fuzzy control theory is used to control the nine solenoid air valves, greatly reducing the sound distortion caused by the original single air valve. Experimental results show that fuzzy theory-based air valve control can fully realize the functions of auto-score-recognition soprano recorder machines.

CONFLICTS OF INTEREST

The author declares no conflicts of interest.

AUTHOR INTRODUCTION

Dr. Chun-Chieh Wang

He received the PhD degree from the Graduate School of Engineering Science and Technology, National Yunlin University of Science & Technology, Yunlin, Taiwan, in 2004. He is currently a Professor in the Department of Automation Engineering and Institute of Mechatronoptic Systems at Chienkuo Technology University. His research interests include robotics, image detection, electromechanical integration, innovative inventions, long-term care aids, and applications of control theory. He is a permanent member of the Chinese Automatic Control Society (CACS) and of the Taiwan Society of Robotics (TSR).

Publication Date
2021/12/27
ISSN (Online)
2352-6386
ISSN (Print)
2405-9021
