Journal of Robotics, Networking and Artificial Life

Volume 8, Issue 2, September 2021, Pages 99 - 103

Performance Information Editing System for Player Piano Aiming at Human-like Performance

Authors
Ryo Kinoshita*, Eiji Hayashi
Department of Intelligent and Control Systems, Kyushu Institute of Technology, Hayashi Lab, 680-4, Kawazu, Iizuka-City, Fukuoka, 820-8502, Japan
*Corresponding author. Email: kinoshita@mmcs.mse.kyutech.ac.jp
Received 25 November 2020, Accepted 10 May 2021, Available Online 30 July 2021.
DOI
10.2991/jrnal.k.210713.006
Keywords
Automatic piano; knowledge database; computer music; music interface
Abstract

We have developed a system that enables automatic piano performance. To make this piano play like a living pianist, we have to add expression to the performance. Accordingly, we have developed a musical editing system that utilizes a database to edit music more efficiently.

Copyright
© 2021 The Authors. Published by Atlantis Press International B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

We have developed a system that enables automatic piano performance [1]. Eighty-eight actuators are attached to the keys of a grand piano, and two actuators are attached to the pedals. These actuators execute the keystrokes and pedal movements that produce the piano’s performance, as shown in Figure 1 [2].

Figure 1

The automatic piano.

In our research, we are trying to bring human-like skill and expression to player piano performance. For a player piano to play expressively, the volume, length, and timing of each note must be adjusted. We aimed to develop a system that can infer a performance based on the features of a skilled pianist. Therefore, we developed a system that automatically estimates the performance expression of unedited music by using edited performance data and score data. In this paper, we describe a phrase search using Dynamic Programming (DP) matching [3]. In addition, we describe a method for selecting an optimal phrase and for inferring the parameters of notes. Finally, we evaluate an entire inferred song played with both hands.

2. EDITING SUPPORT SYSTEM

2.1. Performance Information

The automatic piano that we have developed uses a musical data structure similar to the Musical Instrument Digital Interface (MIDI). The editing support system edits four parameters contained in each tone: “Velo” (velocity), “Gate”, “Step”, and “Time”. “Velo” is the volume of a sound, given as a value from 1 to 127. “Gate” is the duration of the note in milliseconds. “Step” is the interval of time between notes. “Time” indicates when the sound starts, measured from the start of the music. “Step” and “Time” are also given in milliseconds.
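The four parameters above can be pictured as a small record per note. The following is a minimal sketch, not the authors' data format: the `Note` class, the `pitch` field, and the `assign_times` helper are hypothetical names, and the sketch assumes that “Step” is measured from the onset of one note to the onset of the next and that the first note starts at 0 ms.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    pitch: int      # hypothetical field: MIDI-style note number
    velo: int       # "Velo": volume, 1 to 127
    gate: int       # "Gate": note duration in milliseconds
    step: int       # "Step": interval to the next note in milliseconds
    time: int = 0   # "Time": onset in milliseconds since the music started


def assign_times(notes: List[Note]) -> List[Note]:
    """Fill in Time by accumulating Step (assumed onset-to-onset)."""
    elapsed = 0
    for note in notes:
        if not 1 <= note.velo <= 127:
            raise ValueError(f"Velo out of range: {note.velo}")
        note.time = elapsed
        elapsed += note.step
    return notes


phrase = assign_times([
    Note(pitch=60, velo=64, gate=450, step=500),
    Note(pitch=62, velo=70, gate=225, step=250),
    Note(pitch=64, velo=60, gate=900, step=1000),
])
print([n.time for n in phrase])
```

Under these assumptions “Time” is redundant with the running sum of “Step”, which is why the sketch derives it instead of storing it independently.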

2.2. How to Make the Data for Player Piano

The structure of the editing support system that makes the data for the player piano is shown in Figure 2. The system first extracts a pianist’s features from performance data of pieces that the pianist has played. Next, the performance data for an unperformed song is inferred from the pianist’s features and the score information of a musical piece that the pianist has never played.

Figure 2

Editing system structure.

2.3. Search System

The results of the analysis showed that phrases with the same pattern within the same song are played with similar expressions (Figure 3) [4]. Likewise, it was predicted that the expression of the same pattern would be similar in other songs. Therefore, we adopted DP matching, which allows us to search for phrases similar to an arbitrarily chosen search phrase.

Figure 3

The finding on the relation between the same tunes and similar expressions.

DP matching is a technique widely used in fields such as speech recognition and bioinformatics. It can calculate the similarity between two words even when they differ in their number of characters. In Figure 4, the minimum-cost route to each point is taken, and the route with the lowest total cost is finally taken as the optimal path. The cost of that path is defined as the distance between the patterns. In this system, this distance is compared with a threshold to judge whether the phrases are similar.

Figure 4

Example of DP matching method.
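The DP matching described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that phrases are sequences of pitch numbers, that insertion, deletion, and substitution each cost 1, and that candidates are taken from a fixed-length sliding window. The names `dp_distance` and `find_similar` are hypothetical.

```python
from typing import List, Sequence, Tuple


def dp_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Edit-distance style DP matching with unit costs (an assumption)."""
    rows, cols = len(a) + 1, len(b) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i          # delete every element of a
    for j in range(1, cols):
        cost[0][j] = j          # insert every element of b
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if a[i - 1] == b[j - 1] else 1
            # Keep the minimum-cost route into each grid point.
            cost[i][j] = min(
                cost[i - 1][j] + 1,
                cost[i][j - 1] + 1,
                cost[i - 1][j - 1] + substitution,
            )
    return cost[-1][-1]


def find_similar(query: Sequence[int], song: Sequence[int],
                 threshold: int) -> List[Tuple[int, int]]:
    """Return (start index, distance) for windows within the threshold."""
    width = len(query)
    hits = []
    for start in range(len(song) - width + 1):
        distance = dp_distance(query, song[start:start + width])
        if distance <= threshold:
            hits.append((start, distance))
    return hits


query = [60, 62, 64]
song = [60, 62, 64, 65, 60, 62, 65]
print(find_similar(query, song, threshold=1))
```

Because the distance is defined for sequences of different lengths, the window width could also be varied around the query length; the fixed width here only keeps the sketch short.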

2.4. Select System

After the search system has run, many phrases may share the same DP matching score. Such phrases are called similar phrases. An objective phrase is a phrase for which an inference will be made, and an optimal phrase is the similar phrase that is considered to be the best fit for an objective phrase. An optimal phrase must be selected for each objective phrase, and the select system is used for this purpose. In this system, five indicators are used from the viewpoint of music theory: dynamic symbols, beat and Step, similarity in changes of the interval, dynamic symbols and Velo, and musical forms.

2.4.1. Dynamic symbols

If the dynamic symbols differ between a similar phrase and an objective phrase, the performance is affected even if the notes of the phrase are the same. Therefore, similar phrases whose dynamics on the score match those of the search phrase are selected preferentially.

2.4.2. Beat and step

According to musical grammar, strong beats are closely related to Step [5]. Similar phrases are selected using the property that the Step value at positions considered to be strong beats is larger than at other positions. The position of the strong beat depends on the rhythm.

2.4.3. Similarity in changes of the interval

If the interval changes of two phrases are similar, their performance expressions are also expected to be similar. Thus, the system selects phrases with high similarity in the change of interval.
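One way to make "similarity in the change of interval" concrete is to compare the pitch differences between consecutive notes. The paper does not state its similarity measure, so the following is only a sketch under an assumed definition: the fraction of positions at which the two interval sequences agree. The function names are hypothetical.

```python
from typing import List, Sequence


def intervals(pitches: Sequence[int]) -> List[int]:
    """Pitch change (in semitones) between consecutive notes."""
    return [later - earlier for earlier, later in zip(pitches, pitches[1:])]


def interval_similarity(p: Sequence[int], q: Sequence[int]) -> float:
    """Assumed measure: share of positions with an identical interval."""
    ip, iq = intervals(p), intervals(q)
    if not ip or not iq:
        return 0.0
    matches = sum(1 for x, y in zip(ip, iq) if x == y)
    return matches / max(len(ip), len(iq))


# A transposed phrase keeps all of its interval changes.
print(interval_similarity([60, 62, 64, 65], [67, 69, 71, 72]))
# Changing only the last note alters one of three intervals.
print(interval_similarity([60, 62, 64, 65], [60, 62, 64, 67]))
```

Working on intervals rather than absolute pitches means a phrase and its transposition are treated as the same pattern, which fits the idea of reusing expression across songs in different keys.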

2.4.4. Dynamic symbols and Velo

The Velo values are likely to show a trend that depends on the dynamic symbol. Hence, the range of Velo values for each dynamic symbol is examined from the performance data. The system then selects a phrase whose Velo values correspond to the dynamic symbol of the search phrase.
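The range check described above might look like the following sketch. It assumes the range for a dynamic symbol is simply the minimum and maximum Velo observed under that symbol, which the paper does not specify; the sample values and the names `velo_ranges` and `phrase_fits` are hypothetical.

```python
from typing import Dict, Iterable, Sequence, Tuple


def velo_ranges(observations: Iterable[Tuple[str, int]]) -> Dict[str, Tuple[int, int]]:
    """Map each dynamic symbol to the (min, max) Velo seen in performance data."""
    ranges: Dict[str, Tuple[int, int]] = {}
    for symbol, velo in observations:
        low, high = ranges.get(symbol, (velo, velo))
        ranges[symbol] = (min(low, velo), max(high, velo))
    return ranges


def phrase_fits(velos: Sequence[int], symbol: str,
                ranges: Dict[str, Tuple[int, int]]) -> bool:
    """True if every Velo in the phrase lies inside the symbol's range."""
    if symbol not in ranges:
        return False
    low, high = ranges[symbol]
    return all(low <= v <= high for v in velos)


# Made-up observations, for illustration only.
ranges = velo_ranges([("p", 40), ("p", 55), ("f", 80), ("f", 100)])
print(ranges)
print(phrase_fits([45, 50], "p", ranges), phrase_fits([45, 60], "p", ranges))
```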

2.4.5. Musical forms

Finally, a selection is made using musical forms, because the same part of a form was predicted to have a similar expression.

2.4.6. How to select left-hand phrases

When selecting left-hand phrases, the system narrows down the candidates by phrase category before making a selection based on the five indicators, because phrases in the same category are likely to have a similar expression. Left-hand phrases can be divided into four categories: “Main theme”, “Broken chord”, “Single”, and “Chord”. If there are no similar phrases in the same category, the system selects an optimal phrase from the other categories.
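The category narrowing with its fallback can be sketched as a simple filter. This is an illustration only: the dictionary layout of a candidate phrase and the name `narrow_by_category` are assumptions, and the subsequent ranking by the five indicators is left out.

```python
from typing import Dict, List

CATEGORIES = ("Main theme", "Broken chord", "Single", "Chord")


def narrow_by_category(objective_category: str,
                       candidates: List[Dict]) -> List[Dict]:
    """Keep same-category candidates; fall back to all candidates if none."""
    if objective_category not in CATEGORIES:
        raise ValueError(f"Unknown category: {objective_category}")
    same = [c for c in candidates if c["category"] == objective_category]
    return same if same else list(candidates)


candidates = [
    {"id": "a", "category": "Chord"},
    {"id": "b", "category": "Broken chord"},
    {"id": "c", "category": "Chord"},
]
print([c["id"] for c in narrow_by_category("Chord", candidates)])
print([c["id"] for c in narrow_by_category("Single", candidates)])
```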

2.5. Infer System

If the expression of an optimal phrase is applied directly to a search phrase, the result is an unnatural expression. Thus, this system infers appropriate “Step”, “Velo”, and “Gate” values for the search phrase from the optimal phrase.

2.5.1. Data used for inference

Proper inference requires an investigation of performance data. This year, data for the four songs shown in Table 1 were used. All songs were composed by W.A. Mozart and performed by Maria João Pires.

Music title
Piano Sonata No.11 in A major, K.331, 3rd movement “Turkish March”
Piano Sonata No.15 in C major, K.545 1st movement “Allegro”
Piano Sonata No.15 in C major, K.545 2nd movement “Andante”
Piano Sonata No.15 in C major, K.545 3rd movement “Rondo”
Table 1

Music title of performance data

2.5.2. Inference about step

The results of the tempo investigation and the normalization factors calculated from the music data in Table 1 are shown in Table 2. The “Normalization factor” is the “All tempo average” divided by the “Tempo average”.

Music number
Tempo average          0.80   0.82   1.00   0.92
All tempo average      0.89
Normalization factor   1.11   1.07   0.89   0.96
Table 2

Results of the tempo investigation

The inference equation for Step is Equation (1), where “PStep” represents the provisional Step value calculated from the optimal phrase and “NF” represents the normalization factor.

Step = PStep × NF × All Tempo Average  (1)

When inferring left-hand phrases, the timing is adjusted based on the results of the right-hand inference whenever notes occur at the same time in the musical score.
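As a worked illustration of the normalization factor and Equation (1), the sketch below recomputes the factor from the rounded entries of Table 2 and applies the equation to a provisional Step. The function names and the provisional value of 500 ms are hypothetical. For the first column, 0.89 / 0.80 = 1.1125, which the table reports as 1.11; recomputing the second and fourth columns from the rounded entries gives about 1.09 and 0.97 rather than the published 1.07 and 0.96, presumably because the published factors were derived from unrounded averages.

```python
ALL_TEMPO_AVERAGE = 0.89  # "All tempo average" as published in Table 2


def normalization_factor(tempo_average: float,
                         all_tempo_average: float = ALL_TEMPO_AVERAGE) -> float:
    """NF = All tempo average / Tempo average."""
    return all_tempo_average / tempo_average


def infer_step(pstep: float, nf: float,
               all_tempo_average: float = ALL_TEMPO_AVERAGE) -> float:
    """Equation (1): Step = PStep x NF x All Tempo Average."""
    return pstep * nf * all_tempo_average


for tempo in (0.80, 0.82, 1.00, 0.92):
    print(tempo, normalization_factor(tempo))

# Hypothetical provisional Step of 500 ms with the published NF of 1.11.
print(infer_step(500, 1.11))
```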

2.5.3. Inference about Velo

The results of the investigation of average Velo values, using the music data in Table 1, are shown in Table 3.

Dynamic symbol   Average (right-hand)   Average (left-hand)
p                64                     44
mf               62                     36
f                64                     45
Table 3

Average Velo value for each dynamic symbol

A similar interval phrase is the phrase that has the highest similarity of interval change among the similar phrases. It is used to infer the Velo value.

The inference equations for Velo are Equations (2)–(4). “n” represents the note number within an objective phrase, and “SVelo” represents the Velo value of a similar interval phrase. Equation (2) is used when the note is the first note in the phrase and its dynamic symbol differs from that of the previous note. Depending on the results of DP matching, the search phrase may have to be split. In this case, the first note of the split phrase is calculated by Equation (3), unless Equation (2) applies. All other notes are calculated by Equation (4).

Velo(n) = average for the dynamic symbol  (2)
Velo(n) = Velo(n − 1)  (3)
Velo(n) = Velo(n − 1) […] SVelo’s value  (4)

2.5.4. Inference about gate

The factors by which each musical symbol affects the length of a note, obtained from the music data in Table 1, are shown in Table 4.

Musical symbol   staccato   none   slur
Factor           0.4        0.9    1
Table 4

Factors by which musical symbols affect the note length

The inference equation for Gate is Equation (5).

Gate = Inferred Step × Factor  (5)
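Equation (5) together with Table 4 amounts to a lookup followed by a multiplication. The sketch below illustrates this; the function name `infer_gate`, the dictionary keys, and the example Step of 500 ms are assumptions made for illustration.

```python
# Factors from Table 4: how each musical symbol scales the note length.
GATE_FACTOR = {"staccato": 0.4, "none": 0.9, "slur": 1.0}


def infer_gate(inferred_step: float, symbol: str = "none") -> float:
    """Equation (5): Gate = Inferred Step x Factor."""
    return inferred_step * GATE_FACTOR[symbol]


# Hypothetical inferred Step of 500 ms under each symbol.
for symbol in ("staccato", "none", "slur"):
    print(symbol, infer_gate(500, symbol))
```

With these factors, a staccato note sounds for less than half of its Step, an unmarked note leaves a short gap before the next note, and a slurred note lasts until the next onset.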

3. EXPERIMENT

An experiment was conducted to compare the music reproduced by the editing support system with the pianist’s performance. The target music was Mozart’s Piano Sonata No.11 in A major, K.331, 1st movement “Theme”.

The score of the part for which inference results are presented is shown in Figure 5. All inference results are shown as graphs. Figures 6 and 7 show the results for “Step” for the right hand and the left hand, respectively. Figures 8 and 9 show the results for “Gate”, and Figures 10 and 11 show the results for “Velo”, in the same order.

Figure 5

The score of the inferred part.

Figure 6

Comparison graph for right-hand Step.

Figure 7

Comparison graph for left-hand Step.

Figure 8

Comparison graph for right-hand Gate.

Figure 9

Comparison graph for left-hand Gate.

Figure 10

Comparison graph for right-hand Velo.

Figure 11

Comparison graph for left-hand Velo.

4. CONSIDERATION

We can confirm that the inferred Step values are closer to the pianist’s performance than the unedited data, and this holds for both hands. The same trend can be confirmed for Gate. Hence, the results suggest that the infer system is effective for Step and Gate. However, the inference results for phrases containing rests are somewhat far from the performance data, as with numbers 13–17 in Figure 9, so the inference method for phrases that include rests needs to be reconsidered. In addition, the inferred Velo values are not similar to the performance data, so the infer system for Velo needs to be improved. Moreover, the current system cannot take into account features such as a phrase being the last phrase of a passage. It is therefore also necessary to establish an inference method that can consider the position of phrases.

5. CONCLUSION

In the past few years, a method using the change of interval was established [6], and a select system was established [7]. This year, we introduced an inference system for the left hand and improved some of the existing systems. The inference experiment showed that we were able to infer the performance expression of an unperformed song from a set of performance data. Our next steps are to infer pedal information and to improve inference accuracy.

The editing system for performance information infers a phrase from similar phrases in other songs. When the same phrase is repeated, the inferred result is exactly the same each time. By contrast, a pianist would subtly change inflection and timing for the same phrase. Moreover, the current system cannot infer performance information if there are no dynamic symbols in a score. To solve these problems, we need to develop a new, more versatile system that adds other methods to the current system.

CONFLICTS OF INTEREST

The authors declare they have no conflicts of interest.

AUTHORS INTRODUCTION

Mr. Ryo Kinoshita

He is a Master’s course student at the Department of Intelligent and Control Systems, Kyushu Institute of Technology, Japan. He received his Bachelor’s degree from the Department of Mechanical Information Science and Technology, Kyushu Institute of Technology, Japan, in 2020.

Prof. Eiji Hayashi

He is a Professor in the Department of Intelligent and Control Systems at Kyushu Institute of Technology. He received a PhD (Dr. Eng.) degree from Waseda University in 1996. His research interests include intelligent mechanics, mechanical systems, and perceptual information processing. He is a member of the Institute of Electrical and Electronics Engineers (IEEE) and the Japan Society of Mechanical Engineers (JSME).

Journal
Journal of Robotics, Networking and Artificial Life
Volume-Issue
8 - 2
Pages
99 - 103
Publication Date
2021/07/30
ISSN (Online)
2352-6386
ISSN (Print)
2405-9021

Cite this article

TY  - JOUR
AU  - Ryo Kinoshita
AU  - Eiji Hayashi
PY  - 2021
DA  - 2021/07/30
TI  - Performance Information Editing System for Player Piano Aiming at Human-like Performance
JO  - Journal of Robotics, Networking and Artificial Life
SP  - 99
EP  - 103
VL  - 8
IS  - 2
SN  - 2352-6386
UR  - https://doi.org/10.2991/jrnal.k.210713.006
DO  - 10.2991/jrnal.k.210713.006
ID  - Kinoshita2021
ER  -