E. Benetos, S. Dixon, Z. Duan, and S. Ewert, Automatic music transcription: An overview, IEEE Signal Processing Magazine, vol.36, issue.1, pp.20-30, 2019.

S. Böck, M. E. P. Davies, and P. Knees, Multi-task learning of tempo and beat: Learning one to improve the other, ISMIR, 2019.

J. Calvo-Zaragoza, J. Hajič Jr., and A. Pacha, Understanding optical music recognition, Computing Research Repository, 2019.

C. E. Cancino-Chacón, M. Grachten, W. Goebl, and G. Widmer, Computational models of expressive music performance: A comprehensive and critical review, Frontiers in Digital Humanities, vol.5, p.25, 2018.

R. G. C. Carvalho and P. Smaragdis, Towards end-to-end polyphonic music transcription: Transforming music audio directly to a score, WASPAA, pp.151-155, 2017.

A. Cogliati and Z. Duan, A metric for music notation transcription accuracy, ISMIR, pp.407-413, 2017.

M. S. Cuthbert, C. Ariza, and L. Friedland, Feature extraction and machine learning on symbolic music using the music21 toolkit, ISMIR, pp.387-392, 2011.

V. Emiya, R. Badeau, and B. David, Multipitch estimation of piano sounds using a new probabilistic spectral smoothness principle, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.6, pp.1643-1654, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00510392

F. Foscarin, D. Fiala, F. Jacquemard, P. Rigaux, and V. Thion, Gioqoso, an online quality assessment tool for music notation, 4th International Conference on Technologies for Music Notation and Representation (TENOR'18), 2018.

T. Gadermaier and G. Widmer, A study of annotation and alignment accuracy for performance comparison in complex orchestral music, 2019.

W. Goebl, Numerisch-klassifikatorische Interpretationsanalyse mit dem Bösendorfer Computerflügel, 1999.

F. Gouyon, A. Klapuri, S. Dixon, M. Alonso, G. Tzanetakis et al., An experimental comparison of audio tempo induction algorithms, IEEE Transactions on Audio, Speech, and Language Processing, vol.14, issue.5, pp.1832-1844, 2006.

P. Grosche, M. Müller, and C. S. Sapp, What makes beat tracking difficult? A case study on Chopin mazurkas, ISMIR, pp.649-654, 2010.

S. W. Hainsworth and M. D. Macleod, Particle filtering applied to musical tempo tracking, EURASIP Journal on Advances in Signal Processing, issue.15, p.927847, 2004.

M. Hashida, T. Matsui, and H. Katayose, A new music database describing deviation information of performance expressions, ISMIR, pp.489-494, 2008.

C. Hawthorne, E. Elsen, J. Song, A. Roberts, I. Simon et al., Onsets and frames: Dual-objective piano transcription, ISMIR, 2018.

C. Hawthorne, A. Stasyuk, A. Roberts, I. Simon, C. Huang et al., Enabling factorized piano music modeling and generation with the MAESTRO dataset, International Conference on Learning Representations, 2019.

A. Holzapfel, M. E. P. Davies, J. R. Zapata et al., Selective sampling for beat tracking evaluation, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.9, pp.2539-2548, 2012.

C. Huang, A. Vaswani, J. Uszkoreit, I. Simon, C. Hawthorne et al., Music transformer: Generating music with long-term structure, International Conference on Learning Representations, 2018.

D. Jeong, T. Kwon, Y. Kim, K. Lee, and J. Nam, VirtuosoNet: A hierarchical RNN-based system for modeling expressive piano performance, ISMIR, pp.908-915, 2019.

R. Kelz, M. Dorfer, F. Korzeniowski, S. Böck, A. Arzt et al., On the potential of simple framewise approaches to piano transcription, ISMIR, pp.475-481, 2016.

J. Kim and J. P. Bello, Adversarial learning for improved onsets and frames music transcription, ISMIR, pp.670-677, 2019.

F. Krebs, S. Böck, and G. Widmer, Rhythmic pattern modeling for beat and downbeat tracking in musical audio, ISMIR, pp.227-232, 2013.

U. Marchand and G. Peeters, Swing ratio estimation, Digital Audio Effects (DAFx), pp.423-428, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01252603

A. McLeod, E. Nakamura, and K. Yoshii, Improved metrical alignment of MIDI performance based on a repetition-aware online-adapted grammar, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.186-190, 2019.

A. McLeod and M. Steedman, HMM-based voice separation of MIDI performance, Journal of New Music Research, vol.45, issue.1, pp.17-26, 2016.

A. McLeod and M. Steedman, Evaluating automatic polyphonic music transcription, ISMIR, pp.42-49, 2018.

M. Müller, V. Konz, W. Bogler, and V. Arifi-Müller, Saarland Music Data (SMD), Late-Breaking and Demo Session of ISMIR, 2011.

E. Nakamura, E. Benetos, K. Yoshii, and S. Dixon, Towards complete polyphonic music transcription: Integrating multi-pitch detection and rhythm quantization, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.101-105, 2018.

E. Nakamura, K. Yoshii, and H. Katayose, Performance error detection and postprocessing for fast and accurate symbolic music alignment, ISMIR, pp.347-353, 2017.

R. Nishikimi, E. Nakamura, S. Fukayama, M. Goto, and K. Yoshii, Automatic singing transcription based on encoder-decoder recurrent neural networks with a weakly-supervised attention mechanism, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.161-165, 2019.

A. Olmos, N. Bouillot, T. Knight, N. Mabire, J. Redel et al., A high-fidelity orchestra simulator for individual musicians' practice, Computer Music Journal, vol.36, issue.2, pp.55-73, 2012.

C. Raffel and D. P. W. Ellis, Intuitive analysis, creation and manipulation of MIDI data with pretty_midi, ISMIR Late Breaking and Demo Papers, 2014.

P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang, SQuAD: 100,000+ questions for machine comprehension of text, Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp.2383-2392, 2016.

M. A. Román, A. Pertusa, and J. Calvo-Zaragoza, A holistic approach to polyphonic music transcription with neural networks, ISMIR, pp.731-737, 2019.

Z. Shi, C. S. Sapp, K. Arul, J. McBride, and J. O. Smith, SUPRA: Digitizing the Stanford University Piano Roll Archive, ISMIR, pp.517-523, 2019.

D. Stoller, S. Ewert, and S. Dixon, Wave-U-Net: A multi-scale neural network for end-to-end audio source separation, ISMIR, pp.334-340, 2018.

G. Tzanetakis and P. Cook, Musical genre classification of audio signals, IEEE Transactions on Speech and Audio Processing, vol.10, issue.5, pp.293-302, 2002.

A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, Deep learning for computer vision: A brief review, Computational Intelligence and Neuroscience, 2018.

C. Weiß, V. Arifi-müller, T. Prätzlich, R. Kleinertz, and M. Müller, Analyzing measure annotations for western classical music recordings, ISMIR, pp.517-523, 2016.

A. Ycart and E. Benetos, A-MAPS: Augmented MAPS dataset with rhythm and key annotations, ISMIR Late Breaking and Demo Papers, 2018.

T. Young, D. Hazarika, S. Poria, and E. Cambria, Recent trends in deep learning based natural language processing, IEEE Computational Intelligence Magazine, vol.13, issue.3, pp.55-75, 2018.